Capture and release based isotope tagged peptides and methods for using the same

ABSTRACT

The invention provides non-affinity based isotope tagged peptides, chemistries for making these peptides, and methods for using these peptides. In one aspect, tags comprise a reactive site (RS) for reacting with a molecule on a protein to form a stable association with the peptide (e.g., a covalent bond) and an anchoring site (AS) group for reversibly or removably anchoring the tag to a solid phase such as a resin support. Anchoring may be direct or indirect (e.g., through a linker molecule). Preferably, the anchoring site comprises a biotin compound. Preferably, the tag comprises a mass-altering label, such as a stable isotope, such that association of the tag with the peptide can be monitored by mass spectrometry. The reagents can be used for rapid and quantitative analysis of proteins or protein function in mixtures of proteins.

RELATED APPLICATIONS

[0001] This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 60/476,511, filed Jun. 6, 2003, the entirety of which is hereby incorporated by reference.

GOVERNMENT GRANTS

[0002] At least part of the work contained in this application was performed under government grant HG00041 from the National Institutes of Health. The government may have certain rights in this invention.

FIELD OF THE INVENTION

[0003] The invention relates to stable isotope tags and methods of using these for quantitative protein expression profiling.

BACKGROUND OF THE INVENTION

[0004] Proteins are essential for the control and execution of virtually every biological process. Protein function is not necessarily a direct manifestation of the expression level of a corresponding mRNA transcript in a cell, but is impacted by post-translational modifications, such as protein phosphorylation, and the association of proteins with other biomolecules. It is therefore essential that a complete description of a biological system include measurements that indicate the identity, quantity and the state of activity of the proteins which constitute the system. The large-scale analysis of proteins expressed in a cell or tissue has been termed proteome analysis (Pennington et al., 1997).

[0005] At present no protein analytical technology approaches the throughput and level of automation of genomic technology. The most common implementation of proteome analysis is based on the separation of complex protein samples, most commonly by two-dimensional gel electrophoresis (2DE), and the subsequent sequential identification of the separated protein species (Ducret et al., 1998; Garrels et al., 1997; Link et al., 1997; Shevchenko et al., 1996; Gygi et al., 1999; Boucherie et al., 1996). This approach has been revolutionized by the development of powerful mass spectrometric techniques and the development of computer algorithms which correlate protein and peptide mass spectral data with sequence databases and thus rapidly and conclusively identify proteins (Eng et al., 1994; Mann and Wilm, 1994; Yates et al., 1995).

[0006] This technology has reached a level of sensitivity which now permits the identification of essentially any protein which is detectable by conventional protein staining methods including silver staining (Figeys and Aebersold, 1998; Figeys et al., 1996; Figeys et al., 1997; Shevchenko et al., 1996). However, the sequential manner in which samples are processed limits the sample throughput, the most sensitive methods have been difficult to automate, and low abundance proteins, such as regulatory proteins, escape detection without prior enrichment, thus effectively limiting the dynamic range of the technique.

[0007] The development of methods and instrumentation for automated, data-dependent electrospray ionization (ESI) tandem mass spectrometry (MS/MS) in conjunction with microcapillary liquid chromatography (LC) and database searching has significantly increased the sensitivity and speed of the identification of gel-separated proteins. Microcapillary LC-MS/MS has been used successfully for the large-scale identification of individual proteins directly from mixtures without gel electrophoretic separation (Link et al., 1999; Opitek et al., 1997). However, while these approaches dramatically accelerate protein identification, quantities of the analyzed proteins cannot be easily determined, and these methods have not been shown to substantially alleviate the dynamic range problem also encountered by the 2DE/MS/MS approach. Therefore, low abundance proteins in complex samples are also difficult to analyze by the microcapillary LC/MS/MS method without their prior enrichment.

[0008] There is thus a need to provide methods for the accurate comparison of protein expression levels between cells in two different states, particularly for comparison of low abundance proteins. ICAT™ reagent technology makes use of a class of chemical reagents called isotope coded affinity tags (ICAT). These reagents exist in isotopically heavy and light forms which are chemically identical with the exception of eight deuterium or hydrogen atoms, respectively. Proteins from two cell lysates can be labeled independently with one or the other ICAT reagent at cysteinyl residues. After mixing and proteolysing the lysates, the ICAT-labeled peptides are isolated by affinity to a biotin molecule incorporated into each ICAT reagent. ICAT-labeled peptides are analyzed by LC-MS/MS, where they elute as heavy and light pairs of peptides. Quantification is performed by determining the relative expression ratio relating to the amount of each ICAT-labeled peptide pair in the sample.
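The mass difference between the heavy and light ICAT forms described above follows directly from replacing eight hydrogen atoms with deuterium. The following is a minimal illustrative sketch only; the function names are hypothetical, the atomic masses are standard monoisotopic values, and the option of more than one tag per peptide is an assumption added for illustration.

```python
# Standard monoisotopic masses in daltons.
MASS_H = 1.00782503  # hydrogen (1H)
MASS_D = 2.01410178  # deuterium (2H)


def heavy_light_mass_shift(n_substitutions: int = 8) -> float:
    """Mass difference (Da) between the heavy and light forms of a tag
    when n_substitutions hydrogen atoms are replaced by deuterium."""
    return n_substitutions * (MASS_D - MASS_H)


def mz_spacing(n_substitutions: int, charge: int, tags_per_peptide: int = 1) -> float:
    """Expected m/z spacing between the heavy and light peaks of a pair
    for an ion carrying `charge` charges (hypothetical helper)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return tags_per_peptide * heavy_light_mass_shift(n_substitutions) / charge


if __name__ == "__main__":
    print(f"mass shift for 8 D: {heavy_light_mass_shift(8):.4f} Da")
    print(f"m/z spacing at 2+:  {mz_spacing(8, 2):.4f}")
```

Under these assumptions a singly tagged peptide pair is separated by roughly 8.05 Da, which appears as about 4.03 m/z units for a doubly charged ion.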

[0009] Identification of each ICAT-labeled peptide is performed by a second stage of mass spectrometry (MS/MS) and sequence database searching. The end result is relative protein expression ratios on a large scale. The major drawbacks of this technique are: 1) quantification is only relative; 2) specialized chemistry is required; 3) database searches are hindered by the presence of the large ICAT reagent molecule; and 4) relative amounts of posttranslationally modified (e.g., phosphorylated) proteins are transparent to (i.e., not detected by) the analysis.

SUMMARY OF THE INVENTION

[0010] The present invention provides improved chemistry, reagents, and kits for accurate quantification of proteins. In one preferred aspect, proteins can be quantitated directly from cell lysates. The reagents can be used for the rapid and quantitative analysis of protein in mixtures of proteins, e.g., to profile the proteome of a cell at a particular cell state.

[0011] In another aspect, the invention provides a reagent for mass spectrometric analysis of proteins comprising a tag molecule. Preferably, the tag molecule comprises a reactive site for stably associating with a protein, an isotope label, and an anchoring site for anchoring the tag molecule to a solid phase. Anchoring may be direct, e.g., as a consequence of a covalent or non-covalent bond between the anchoring site of the tag and the solid phase, or indirect, through a linker which can be cleaved from the tag molecule.

[0012] A particularly useful anchoring site is provided by biotin, which is well known to complex with avidin. A series of new biotin based catch and release reagents are provided by the invention which comprise a biotin residue and an alkylating group which are connected by a linker. Preferred alkylating groups are suitable for alkylating cysteine residues of polypeptides. Preferred biotin derivatives comprise biotin and a 2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyric acid coupled through a di(2-aminoethyl)ether, which may have one or more ethylene glycol repeat units interposed between the amino residues, e.g., a linker of the formula: —NH((CH₂)₂O)_(n)(CH₂)₂NH—, where n is an integer of from 0 to about 5.
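For illustration, the elemental composition and monoisotopic mass of the linker residue of the formula given above can be tabulated for each value of n. This is a minimal sketch, not part of the disclosed chemistry: the function names are hypothetical, and the composition C(2n+2)H(4n+6)N2O(n) is derived here by counting atoms in the stated formula (two NH groups, n ethyleneoxy units and one terminal ethylene unit).

```python
# Standard monoisotopic masses in daltons.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def linker_composition(n: int) -> dict:
    """Atom counts for the linker residue -NH((CH2)2O)n(CH2)2NH-."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    return {"C": 2 * n + 2, "H": 4 * n + 6, "N": 2, "O": n}


def linker_mass(n: int) -> float:
    """Monoisotopic mass (Da) of the linker residue for a given n."""
    return sum(MONO[el] * count for el, count in linker_composition(n).items())


if __name__ == "__main__":
    for n in range(0, 6):  # the text gives n as 0 to about 5
        comp = linker_composition(n)
        formula = "".join(f"{el}{comp[el]}" for el in "CHNO" if comp[el])
        print(f"n={n}: {formula:<12} {linker_mass(n):.4f} Da")
```

With n = 1 the residue corresponds to di(2-aminoethyl)ether less one hydrogen at each nitrogen (C4H10N2O), and each additional ethylene glycol unit adds C2H4O.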

[0013] When using biotin derivatives in accord with the present invention, the tag portion of the reagent is cleavable by pH, or a reducing agent, or other means, but not by reversing the affinity bond between the biotin and avidin. Thus, although affinity complexing is utilized to attach to the solid phase, the cleavable bond is other than the affinity bond. Preferably, the cleavable bond to disassociate the tag is capable of being cleaved by a reducing agent. More preferably, the bond cannot be cleaved by a free disulfide, but is cleaved by a phosphine reducing agent such as TCEP or the like.

[0014] In another preferred aspect, the anchoring site of the tag molecule forms a pH sensitive bond with the solid phase. Preferably, the anchoring site forms covalent bonds to a cis hydroxyl pair on the solid phase under selected pH and reducing conditions and can be disassociated from the solid phase by changing those conditions. Particularly preferred are bonds that are sterically hindered such that they are not cleaved by free disulfides but are cleaved by phosphines.

[0015] In another aspect, the tag molecule comprises the general formula R—B(OH)₂, wherein the R group is a suitable chemical moiety for attaching the isotope. Suitable R groups include, but are not limited to: an alkyl group, aryl group, heteroaryl group, arylalkyl group, heteroarylalkyl group, and a cyclic molecule. In a further aspect, the tag molecule is phenyl-B(OH)₂.

[0016] Preferred isotopes are stable isotopes selected from the group consisting of a stable isotope of hydrogen, nitrogen, oxygen, carbon, phosphorus and sulfur.

[0017] Reactive site groups include, but are not limited to, chemical moieties that react with sulfhydryl groups, amino groups, carboxylate groups, ester groups, phosphate groups, aldehyde groups, ketone groups and with homoserine lactone after fragmentation with CNBr. Sites on proteins may be naturally reactive with reactive site groups or can be made reactive upon exposure to an agent (e.g., an alkylating agent, a reducing agent, etc.).

[0018] In one aspect, the reactive site group of the tag molecule forms a stable association with a modified residue of a protein. The modified residue may be glycosylated, methylated, acylated, phosphorylated, ubiquitinated, farnesylated, or ribosylated.

[0019] The pH sensitive anchoring group of a tag molecule forms a bond with a solid phase under selected pH and reducing conditions. Examples of sensitive bonds include, but are not limited to: acyloxyalkyl ether bonds, acetal bonds, thioacetal bonds, aminal bonds, imine bonds, carbonate bonds, and ketal bonds. Preferred bonds are the disulfide bonds.

[0020] The invention also provides a composition comprising a pair of tag molecules as described above, where each member of the pair is identical except for the mass of the isotope attached thereto. For example, one member of the pair comprises a heavy isotope and the other member of the pair comprises the corresponding light form of the isotope. Alternatively, one member of the pair may be labeled while the other member is not.

[0021] The invention further provides a kit comprising reagents and/or compositions as described above, and one or more of a reagent selected from the group consisting of: an activating agent for providing active groups on a protein which bind to the reactive site of the tag molecule; a solid phase; one or more agents for lysing a cell; a pH controlling agent; a reducing agent; one or more proteases; one or more cell samples or fractions thereof. The tag molecule may further be stably associated with a peptide. A preferred class of reducing agents are the phosphines, e.g., TCEP.

[0022] Kits of the invention for use of a biotin based reagent preferably also contain a biotin derivative comprising biotin and a 2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyric acid coupled through a di(2-aminoethyl)ether, which may have one or more ethylene glycol repeat units interposed between the amino residues, e.g., a linker of the formula: —NH((CH₂)₂O)_(n)(CH₂)₂NH—, where n is an integer of from 0 to about 5.

[0023] The invention also provides kits comprising a plurality of tagged peptide molecules, each tagged peptide molecule comprising a peptide and a tag molecule stably associated with the protein, the tag molecule further comprising an isotope label, and a reducing agent sensitive anchoring site for anchoring the tag molecule to a solid phase. In one aspect, the kit comprises pairs of tagged peptides and each member of a pair of tagged peptides comprises an identical peptide and is differentially labeled from the other member of the pair. In another aspect, the kit comprises at least one set of tagged peptides, the set comprising different peptides corresponding to a single protein. In still another aspect, at least one set of tagged peptides comprises peptides corresponding to modified and unmodified forms of a single protein. In a further aspect, the kit comprises at least one set of tagged peptides from a first cell at a first cell state and at least one set of tagged peptides from a second cell at a second cell state. For example, the first cell may be a normally proliferating cell while the second cell is an abnormally proliferating cell (e.g., a cancer cell). First and second cells may also represent different stages of cancer.

[0024] The invention additionally provides a method for identifying one or more proteins or protein functions in one or more samples containing mixtures of proteins. In one aspect, the method comprises: reacting a first sample with any of the reagents described above and a solid phase under conditions suitable to form a solid phase-isotope labeled tag molecule-protein complex. The complex is exposed to one or more proteases, generating solid phase-isotope labeled tag molecule-peptide complexes and untagged peptides. The solid phase-isotope labeled tag molecule-peptide complexes are purified from untagged peptides and exposed to a reducing agent which disrupts associations between the anchoring site of the tag molecule and the solid phase, thereby releasing tagged peptides from the solid phase. Preferably, the sample is subjected to a separation step such as liquid chromatography. The mass of the tagged peptide is determined and correlated with the identity and/or activity of a protein (e.g., the presence of a particular modified form of a protein which is known to be active). Preferably, a mass-to-charge ratio is determined, e.g., by multistage mass spectrometric (MSn) analysis. In addition to determining the identity of a protein, a quantitative measure of the amount of protein in the sample may be obtained. The method may also be used to determine the site of a modification of a protein in one or more samples, by reacting sample proteins with a tag molecule comprising a reactive site which reacts with a modified residue on the protein. In another aspect, the amount of a modified protein in a sample is also determined.

[0025] In a further aspect, the method further comprises reacting a second sample with a second reagent comprising an identical molecular tag as the reagent used in the first sample but which is differentially labeled. Samples are processed in parallel and combined prior to protease digestion. This generates a combined sample comprising at least one pair of tagged peptides, each member of the pair comprising identical peptides but differing in mass. The ratio of members of at least one tagged peptide pair in the combined sample is determined. Preferably, mass spectra are generated. Such spectra will comprise at least one signal doublet for each peptide in the sample, the signal doublet comprising a first signal and a second signal shifted a number of known units from the first signal. The known units will represent the difference in molecular weight between the two members of a tagged peptide pair. Preferably, a signal ratio for a given peptide is determined by relating the difference in signal intensity between the first signal and the second signal.
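The signal doublet analysis described above can be sketched as a simple search of a centroided peak list for pairs of signals separated by the known mass shift. This is an illustrative sketch only and is not the analysis software used with the invention. The function name, the tolerance, the peak values and the 8.05 Da shift in the usage example are hypothetical, and real spectra would also require charge state determination and isotope envelope handling.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def find_doublets(
    peaks: List[Peak],
    mass_shift: float,
    charge: int = 1,
    tol: float = 0.01,
) -> List[Tuple[float, float, float]]:
    """Return (light m/z, heavy m/z, heavy/light intensity ratio) for every
    pair of peaks separated by mass_shift / charge within +/- tol."""
    spacing = mass_shift / charge
    ordered = sorted(peaks)  # ascending m/z
    doublets = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > spacing + tol:
                break  # later peaks are even farther away
            if abs(delta - spacing) <= tol and int_light > 0:
                doublets.append((mz_light, mz_heavy, int_heavy / int_light))
    return doublets


if __name__ == "__main__":
    # Hypothetical peak list with one light/heavy pair 8.05 m/z units apart.
    spectrum = [(500.00, 1000.0), (508.05, 2000.0), (620.30, 300.0)]
    for light, heavy, ratio in find_doublets(spectrum, mass_shift=8.05):
        print(f"light {light:.2f}  heavy {heavy:.2f}  ratio {ratio:.2f}")
```

In this hypothetical spectrum the second signal is twice as intense as the first, so the ratio reported for the pair is 2.0; the unpaired peak at 620.30 is ignored.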

[0026] The relative amounts of members of a tagged peptide pair in the two samples are determined and correlated with the abundance of the protein corresponding to the peptide in the sample. Abundance may be correlated with the state of cells from which the samples were obtained. The correlation may be used to diagnose a pathological condition in a patient from whom one of the cell samples was obtained (e.g., where one of the cell states represents a disease condition).

[0027] Single samples or multiple samples may be analyzed by relating mass spectra data from a tagged peptide to an amino acid sequence. The steps of the method can be repeated, either sequentially or simultaneously, until substantially all of the proteins in a sample are detected and/or identified. In this way a proteome profile for one or more cells can be obtained.

BRIEF DESCRIPTION OF THE FIGURES

[0028] The objects and features of the invention can be better understood with reference to the following detailed description and accompanying drawings.

[0029] FIG. 1 is a schematic diagram illustrating the use of resin-based chemistries to tag peptides according to one aspect of the invention.

[0030] FIG. 2 shows exemplary cleavable linkers that can be used in the method shown in FIG. 1.

[0031] FIG. 3 shows the use of arylboronic acids for protein quantification according to one aspect of the invention.

[0032] FIG. 4 shows the elution profile for a carbohydrate affinity column demonstrating pH sensitive attachment of boron-based tag molecules.

[0033] FIGS. 5A and B show two strategies for capturing and labeling cysteine-containing peptides. FIG. 5A shows the use of a boron-based molecular tag which binds to a resin support comprising cis hydroxy groups presented by a 5-membered cyclic ring compound via the two hydroxy groups on the tag. The tag binds to proteins via a cysteine reactive moiety. FIG. 5B shows the use of the 5-membered cyclic ring as the tag molecule and the use of R—B(OH)₂ as the molecule which presents cis hydroxy groups to capture the tag molecule.

[0034] FIG. 6 is a synthetic protocol for preparing biotin based chemistries to tag peptides according to one aspect of the invention.

[0035] FIG. 7 is an HPLC trace of the reaction mixture in the preparation of biotin derivative IV.

[0036] FIG. 8 is an LC-MS spectrogram of the peak corresponding to biotin derivative (IV) in the HPLC trace of FIG. 7.

[0037] FIG. 9 is a reverse phase HPLC trace of the reaction mixture of Example 3.

[0038] FIG. 10 is the MS spectrogram of the active site (residues 200-221) of human protein tyrosine phosphatase 1B (PTP1B) having a sequence ESGSLSPEHGPVVVHCSAGIGR where [M+H]⁺¹ = 2176.4 and [M+2H]⁺² = 1088.7.
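The two m/z values quoted for FIG. 10 are related by simple charge state arithmetic, which the following minimal sketch illustrates. The function names are hypothetical and the proton mass is a standard value; the sketch only converts the quoted singly charged value to the doubly charged value and does not recompute the peptide mass from its sequence.

```python
PROTON = 1.00728  # mass of a proton in daltons


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass M (Da) from the m/z of an [M+zH]z+ ion."""
    return charge * (mz - PROTON)


def mz_at_charge(mass: float, charge: int) -> float:
    """m/z of the [M+zH]z+ ion for a neutral mass M (Da)."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    m = neutral_mass(2176.4, 1)  # from the quoted [M+H]+ value
    print(f"neutral mass: {m:.2f} Da")
    print(f"[M+2H]2+:     {mz_at_charge(m, 2):.1f}")
```

Starting from the quoted singly charged value of 2176.4, the calculation gives approximately 1088.7 for the doubly charged ion, in agreement with the second quoted value.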

[0039] FIG. 11 is an HPLC trace of Example 4 in which the peak at 14.24 minutes corresponds to the conjugate of PTP1B.

[0040] FIG. 12 is an MS spectrogram of the reaction mixture in the synthesis of the conjugate of PTP1B.

[0041] FIG. 13 is an MS spectrogram of the reaction mixture in the synthesis of the conjugate of PTP1B after reduction with TCEP.

[0042] FIG. 14 is an HPLC trace of the purified conjugate of PTP1B.

[0043] FIG. 15 is an MS spectrogram sampling the peak at 11.81 minutes in the HPLC trace of FIG. 14.

[0044] FIG. 16A illustrates the formula for a preferred catch and release (CAR) reagent for protein profiling.

[0045] FIG. 16B illustrates a tagged protein after cleavage from the reagent at the disulfide bond for the tag and the ¹³C-labeled tag.

DETAILED DESCRIPTION OF THE INVENTION

[0046] The invention provides non-affinity based isotope tagged peptides, chemistries for making these peptides, and methods for using these peptides. In one aspect, tags comprise a reactive site (RS) for reacting with a molecule on a protein to form a stable association with the peptide (e.g., a covalent bond) and an anchoring site (AS) group for reversibly or removably anchoring the tag to a solid phase such as a resin support. Anchoring may be direct or indirect (e.g., through a linker molecule). Preferably, the tag comprises a mass-altering label, such as a stable isotope, such that association of the tag with the peptide can be monitored by mass spectrometry. The reagents can be used for rapid and quantitative analysis of proteins or protein function in mixtures of proteins.

[0047] Definitions

[0048] The following definitions are provided for specific terms which are used in the following written description.

[0049] As used in the specification and claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes a plurality of cells, including mixtures thereof. The term “a protein” includes a plurality of proteins.

[0050] “Protein”, as used herein, means any protein, including, but not limited to peptides, enzymes, glycoproteins, hormones, receptors, antigens, antibodies, growth factors, etc., without limitation. Presently preferred proteins include those comprised of at least 25 amino acid residues, more preferably at least 35 amino acid residues and still more preferably at least 50 amino acid residues.

[0051] As used herein, the term “peptide” refers to a compound of two or more subunit amino acids. The subunits are linked by peptide bonds.

[0052] As used herein, the term “alkyl” refers to univalent groups derived from alkanes by removal of a hydrogen atom from any carbon atom: C_(n)H_(2n+1)—. The groups derived by removal of a hydrogen atom from a terminal carbon atom of unbranched alkanes form a subclass of normal alkyl (n-alkyl) groups: H[CH₂]_(n)—. The groups RCH₂—, R₂CH— (R not equal to H), and R₃C— (R not equal to H) are primary, secondary and tertiary alkyl groups respectively. C(1-22)alkyl refers to any alkyl group having from 1 to 22 carbon atoms and includes C(1-6)alkyl, such as methyl, ethyl, propyl, iso-propyl, butyl, pentyl and hexyl and all possible isomers thereof. By “lower alkyl” is meant C(1-6)alkyl, preferably C(1-4)alkyl, more preferably, methyl and ethyl.

[0053] As used herein, the terms “aryl” and “heteroaryl” mean a 5- or 6-membered aromatic or heteroaromatic ring containing 0-3 heteroatoms selected from O, N, or S; a bicyclic 9- or 10-membered aromatic or heteroaromatic ring system containing 0-3 heteroatoms selected from O, N, or S; or a tricyclic 13- or 14-membered aromatic or heteroaromatic ring system containing 0-3 heteroatoms selected from O, N, or S; each of which rings is optionally substituted with 1-3 lower alkyl, substituted alkyl, substituted alkynyl, —NO₂, halogen, hydroxy, alkoxy, OCH(COOH)₂, cyano, —NZZ, acylamino, phenyl, benzyl, phenoxy, benzyloxy, heteroaryl, or heteroaryloxy; each of said phenyl, benzyl, phenoxy, benzyloxy, heteroaryl, and heteroaryloxy is optionally substituted with 1-3 substituents selected from lower alkyl, alkenyl, alkynyl, halogen, hydroxy, alkoxy, cyano, phenyl, benzyl, benzyloxy, carboxamido, heteroaryl, heteroaryloxy, —NO₂ or —NZZ (wherein Z is independently H, lower alkyl or cycloalkyl, and -ZZ may be fused to form a cyclic ring with nitrogen).

[0054] “Arylalkyl” means an alkyl residue attached to an aryl ring. Examples are benzyl, phenethyl and the like.

[0055] “Heteroarylalkyl” means an alkyl residue attached to a heteroaryl ring. Examples include, e.g., pyridinylmethyl, pyrimidinylethyl and the like.

[0056] “Substituted” alkyl groups mean alkyls where up to three H atoms on each C atom therein are replaced with halogen, hydroxy, lower alkoxy, carboxy, carboalkoxy, carboxamido, cyano, carbonyl, —NO₂, —NZZ; alkylthio, sulfoxide, sulfone, acylamino, amidino, phenyl, benzyl, heteroaryl, phenoxy, benzyloxy, heteroaryloxy, or substituted phenyl, benzyl, heteroaryl, phenoxy, benzyloxy, or heteroaryloxy.

[0057] An “amide” refers to —C(O)—NH—Z, where Z is alkyl, aryl, alkylaryl or hydrogen.

[0058] A “thioamide” refers to —C(S)—NH—Z, where Z is alkyl, aryl, alkylaryl or hydrogen.

[0059] An “ester” refers to —C(O)—OZ′, where Z′ is alkyl, aryl, or alkylaryl.

[0060] An “amine” refers to —N(Z′)Z″, where Z′ and Z″ are independently hydrogen, alkyl, aryl, or alkylaryl, provided that Z′ and Z″ are not both hydrogen.

[0061] An “ether” refers to Z-O-Z, where Z is either alkyl, aryl, or alkylaryl.

[0062] A “thioether” refers to Z-S-Z, where Z is either alkyl, aryl, or alkylaryl.

[0063] A “cyclic molecule” is a molecule which has at least one chemical moiety which forms a ring. The ring may contain three atoms or more. The molecule may contain more than one cyclic moiety; the cyclic moieties may be the same or different.

[0064] Tag Molecules

[0065] Generally, tag molecules according to the invention comprise the formula:

AS—R*—RS,

[0066] where RS represents a reactive site group for reacting with a protein or peptide, AS represents an anchoring site group for stably associating the tag with a solid phase and R represents the backbone of the tag molecule to which the isotope label (*) is attached. As used herein, “stable” refers to an association which remains intact after extensive and multiple washings with a variety of solutions to remove non-specifically bound components.

[0067] The tag can be stably associated with a solid phase (SP) either directly as

SP-AS—R*—RS,

[0068] where “—” between SP and AS represents a covalent bond. Preferably, this bond is pH sensitive.

[0069] Alternatively, the tag can be stably associated with the solid phase as

SP-L-AS—R*—RS, or

SP-AS-L-R*—RS,

[0070] where L is a cleavable linker molecule with at least one cleavage site which can separate the linker from the tag molecule.

[0071] Reactive Site Groups

[0072] The reactive site of a tag molecule is a group that selectively reacts with certain protein functional groups or is a substrate or cofactor of an enzyme of interest. Preferably, the reactive group of the tag molecule reacts with a plurality of different types of cellular proteins. Reaction of the RS of the tag molecule with functional groups on the protein should occur under conditions that do not lead to substantial degradation of the compounds in the sample to be analyzed. Examples of RS groups include, but are not limited to, those which react with sulfhydryl groups to tag proteins containing cysteine, those that react with amino groups, carboxylate groups, ester groups, phosphate reactive groups, and aldehyde and/or ketone reactive groups or, after fragmentation with CNBr, with homoserine lactone.

[0073] Cysteine reactive groups include, but are not limited to, epoxides, alpha-haloacyl groups, nitriles, sulfonated alkyl or aryl thiols and maleimides. Amino reactive groups tag amino groups in proteins and include sulfonyl halides, isocyanates, isothiocyanates, active esters, including tetrafluorophenyl esters and N-hydroxysuccinimidyl esters, acid halides, and acid anhydrides. In addition, amino reactive groups include aldehydes or ketones in the presence or absence of NaBH₄ or NaCNBH₃.

[0074] Carboxylic acid reactive groups include amines or alcohols which become reactive in the presence of a coupling agent such as dicyclohexylcarbodiimide, or 2,3,5,6-tetrafluorophenyl trifluoroacetate and in the presence or absence of a coupling catalyst such as 4-dimethylaminopyridine; and transition metal-diamine complexes including Cu(II)phenanthroline.

[0075] Ester reactive groups include amines which, for example, react with homoserine lactone.

[0076] Phosphate reactive groups include chelated metal where the metal is, for example, Fe(III) or Ga(III), chelated to, for example, nitrilotriacetic acid or iminodiacetic acid.

[0077] Aldehyde or ketone reactive groups include amine plus NaBH₄ or NaCNBH₃, or these reagents after first treating a carbohydrate with periodate to generate an aldehyde or ketone.

[0078] RS groups can also be substrates for a selected enzyme of interest. The enzyme of interest may, for example, be one that is associated with a disease state or birth defect or one that is routinely assayed for medical purposes. Enzyme substrates of interest for use with the methods of this invention include substrates of enzymes which are currently routinely assayed for, such as acid phosphatase, alkaline phosphatase, alanine aminotransferase, amylase, angiotensin converting enzyme, aspartate aminotransferase, creatine kinase, gamma-glutamyltransferase, lipase, lactate dehydrogenase, and glucose-6-phosphate dehydrogenase.

[0079] Anchoring Sites

[0080] The tags according to the invention further comprise an anchoring site for forming stable associations with a solid phase. Tags are either reversibly anchored (e.g., can associate and dissociate from the solid phase depending on solution conditions, such as pH) or removably anchored (e.g., can be disassociated from the support but unable to reattach under any condition). Stable associations can include covalent or non-covalent bonds and, as discussed above, may be direct (i.e., the tag may bind covalently or non-covalently to the solid phase) or indirect (i.e., the tag may bind covalently or non-covalently to a linker molecule which itself forms direct stable associations with the solid phase). In this latter scenario, the anchoring site of the tag molecule is the site on the molecule which stably associates with the linker. In one preferred aspect, tags are anchored to solid supports by pH sensitive covalent bonds. In another preferred aspect, tags are anchored to solid supports by bonds cleavable with a reducing agent, preferably a phosphine agent, e.g., TCEP.

[0081] Tags according to the invention bind minimally, and preferably not at all, to components in the assay system, except the solid phase, and do not significantly bind to surfaces of reaction vessels. Any non-specific interaction of the affinity tag with other components or surfaces should be disrupted by multiple washes that leave association between the tag and solid phase intact. The tag preferably does not undergo peptide-like fragmentation during (MS)^(n) analysis. The tag is preferably soluble in the sample liquid to be analyzed even though attached to a solid phase comprising an insoluble resin such as agarose.

[0082] The tag molecule preferably also contains groups or moieties that facilitate ionization of tagged peptides. For example, the tag molecule may contain acidic or basic groups, e.g., COOH, SO₃H, primary, secondary or tertiary amino groups, nitrogen-heterocycles, ethers, or combinations of these groups. The tag molecule may also contain groups having a permanent charge, e.g., phosphonium groups, quaternary ammonium groups, sulfonium groups, chelated metal ions, tetraalkyl or tetraaryl borate or stable carbanions.

[0083] Cleavable Linkers

[0084] In one aspect, a tag is associated indirectly with a solid phase through a linker molecule. As used herein, a “linker” refers to a bifunctional chemical moiety which comprises an end for stably associating with a solid phase and an end for stably associating with the tag. In one preferred aspect, the linker is cleavable. As used herein, the term “cleavage” refers to a process of releasing a material or compound from a solid support, e.g., to permit analysis of the compound by solution-phase methods. See, e.g., Wells et al. (1998), J. Org. Chem. 63:6430-6431.

[0085] The linker group should be soluble in the sample liquid to be analyzed and should be stable with respect to chemical reaction, e.g., substantially chemically inert, with respect to components of the sample. Preferably, the linker does not interact with the tag molecule except at the tag molecule's anchoring site and does not interact with the support except at the end of the linker which forms stable associations with the support. Any non-specific interactions of the linker should be broken after multiple washes which leave the solid phase:linker:tag molecule (±peptide) complex intact. Linkers preferably do not undergo peptide-like fragmentation during (MS)^(n) analysis.

[0086] Exemplary linker molecules are shown in FIG. 2. As can be seen from the Figure, the exact chemical structure of the linker can vary to allow cleavage to be controlled in a manner suiting a particular assay format and to allow coupling to a particular tag molecule. Thus, the linker can be cleavable by chemical, thermal or photochemical reaction. Photocleavable groups in the linker may include, but are not limited to, 1-(2-nitrophenyl)-ethyl groups. Thermally labile linkers may include, but are not limited to, a double-stranded duplex formed from two complementary strands of nucleic acid, a strand of a nucleic acid with a complementary strand of a peptide nucleic acid, or two complementary peptide nucleic acid strands which will dissociate upon heating.

[0087] Cleavable linkers also include those having disulfide bonds, acid or base labile groups, including among others, diarylmethyl or trimethylarylmethyl groups, silyl ethers, carbamates, oxyesters, ethers, polyethers, diamines, ether diamines, polyether diamines, amides, polyamides, polythioethers, disulfides, alkyl or alkenyl chains (straight chain or branched and portions of which may be cyclic), aryl, diaryl or alkyl-aryl groups, and esters. Enzymatically cleavable linkers include, but are not limited to, protease-sensitive amides or esters, beta-lactamase-sensitive beta-lactam analogs and linkers that are nuclease-cleavable or glycosidase-cleavable.

[0088] Although normally amino acids and oligopeptides are not preferred, when used they typically will employ amino acids having 2 to 3 carbon atoms, i.e., glycine and alanine. Aryl groups in linkers can contain one or more heteroatoms (e.g., N, O or S atoms). Linkages also include substituted benzyl ethers, esters, acetals or ketals, diols, and the like (see U.S. Pat. No. 5,789,172 for a list of useful functionalities and manner of cleavage, herein incorporated by reference). The linkers, when other than a bond, will have from about 1 to 60 atoms, usually 1 to 30 atoms, where the atoms include C, N, O, S, P, etc., particularly C, N and O, and will generally have from about 1 to 12 carbon atoms and from about 0 to 8, usually 0 to 6 heteroatoms. The atoms are exclusive of hydrogen in referring to the number of atoms in a group, unless indicated otherwise.

[0089] The series of new biotin-based ICAT reagents provided by the present invention are particularly useful linkers. These linkers readily form complexes with avidin in solution or attached to a solid phase. As aforesaid, such reagents comprise a biotin residue and an alkylating group which are connected by a bond cleavable by a reducing agent without disassociating the biotin side from the solid support. Preferred alkylating groups are suitable for alkylating cysteine residues of polypeptides. Preferred biotin derivatives comprise biotin and a 2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyric acid coupled through a di(2-aminoethyl)ether, which may have one or more ethylene glycol repeat units interposed between the amino residues, e.g., a linker of the formula: —NH((CH₂)₂O)_(n)(CH₂)₂NH—, where n is an integer of from 0 to about 5.
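
By way of illustration only, the elemental composition and monoisotopic mass of the spacer —NH((CH₂)₂O)_(n)(CH₂)₂NH— can be tabulated for each value of n. The following Python sketch is not part of the disclosed chemistry; the function names and the rounded monoisotopic atomic masses are assumptions introduced for the example, and the spacer is treated as the bare diradical written in the formula.

```python
# Illustrative sketch only: names and rounded atomic masses are assumptions.
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def spacer_composition(n):
    """Atom counts for -NH((CH2)2O)n(CH2)2NH- with n ethylene glycol units."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    # Two NH groups, one terminal (CH2)2 unit, and n repeats of (CH2)2O.
    return {"C": 2 * n + 2, "H": 4 * n + 6, "N": 2, "O": n}


def monoisotopic_mass(composition):
    """Sum of monoisotopic atomic masses for the given atom counts."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


if __name__ == "__main__":
    for n in range(0, 6):
        comp = spacer_composition(n)
        print(n, comp, round(monoisotopic_mass(comp), 4))
```

For n = 1, the spacer derived from di(2-aminoethyl)ether, the sketch gives the composition C₄H₁₀N₂O and a monoisotopic mass of about 102.079 Da.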

[0090] Additional types of linker molecules are described in, e.g., Backes and Ellman (1997) Curr. Opin. Chem. Biol. 1:86-93, Backes et al. (1996), J. Amer. Chem. Soc. 118:3055-3056, Backes and Ellman (1994), J. Amer. Chem. Soc. 116:11171-11172, Hoffmann and Frank (1994), Tetrahedron Lett. 35:7763-7766, Kocis et al. (1993), Tetrahedron Lett. 34:7251-7252, and Plunkett and Ellman (1995), J. Org. Chem. 60:6006-6007.

[0091] In contrast to affinity-based tag molecules, such as ICAT™ reagents, tag molecules stably associated with linker molecules are generally not displaceable from the solid phase by addition of a displacing ligand or by changing solvent, and the cleavage site of the linker is generally distal from the support and proximal to the tag molecule.

[0092] In preferred embodiments of the present invention using biotin derivatives, the affinity complex is used to bind the tag to the solid support but not to release the tag.

[0093] pH and Reduction Sensitive Anchoring Sites

[0094] In another aspect, the tag comprises a molecule with a pH and/or reduction sensitive anchoring site. Examples of such tags are shown in FIG. 2. In one preferred aspect, such a tag minimally comprises R—B(OH)₂, where the R group is a suitable chemical moiety for attaching a label such as a stable isotope. In one embodiment, R is a source of π electrons, i.e., is sp2-bonded to B. Therefore, preferably, R is an aromatic group such as a phenyl group. An exemplary tag molecule includes, but is not limited to, phenyl-B(OH)₂.

[0095] Additionally, the tag molecule comprises an RS group, preferably covalently bound to the R group and distal from the —OH anchor site groups. In one preferred embodiment, the RS group comprises a cysteine-reactive moiety such as the group shown in FIG. 2. However, generally, any of the RS groups described above may also be used.

[0096] Additional molecules may be present between the RS group and R group; however, preferably, the tag molecule is of a suitable size to facilitate mass spectrometric analysis.

[0097] Though boron may be supplied in a variety of ways, it must be present as borate ions in order to bind to a solid phase support (e.g., a polysaccharide-containing support). According to D. J. Doonan and L. D. Lower (“Boron Compounds (Oxide, Acids, Borates)”, in Kirk-Othmer Encyclopedia of Chemical Technology, Vol. 4, p. 67-110, 3rd ed., 1978), boric acid, borate ion and polyions containing various amounts of boron, oxygen, and hydroxyl groups exist in dynamic equilibrium where the percentage of each of the species present is dictated mainly by the pH of the solution. Borate ion begins to dominate the other boron species present in the fluid at a pH of approximately 9.5 and exceeds 95% of total boron species present at a pH of about 11.5. According to B. R. Sanderson (“Coordination Compounds of Boric Acid” in Mellor's Comprehensive Inorganic Chemistry, p. 721-764, 1975), boron species (including borate ions and boric acid among others) react with di- and poly-hydroxyl compounds having a cis-hydroxyl pair to form complexes which are in rapid equilibrium with uncomplexed boron species and the cis-hydroxyl compounds. The relative amounts of the complexed and free materials are provided by the equilibrium constants for the specific systems. The equilibrium constant for borate ion is several orders of magnitude larger (typically by factors of 10⁴ to 10¹⁰) than the equilibrium constant for boric acid with the same cis-hydroxyl compound.

[0098] For all practical purposes, borate ions form complexes (i.e., can serve to crosslink polysaccharides), while boric acid does not. Therefore, in order to have a useable crosslinked solid phase with the minimum boron content, most of the boron must be present as borate ions, which requires a pH of at least about 8.5, preferably at least about 9.5. Reducing the pH below these levels will reversibly break covalent bonds between the hydroxyl groups of the borate ions and the solid phase.
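
The pH dependence described above can be approximated, for illustration only, by a single acid-base equilibrium between boric acid and borate ion. The Python sketch below assumes a simple monoprotic Henderson-Hasselbalch model with a pKa of about 9.24 (a commonly quoted value for dilute boric acid at 25° C.); it ignores the polyions mentioned above, so its output will not reproduce the cited percentages exactly. The function name and default pKa are assumptions of this example, and a boronic acid tag such as phenyl-B(OH)₂ would have a different pKa.

```python
# Illustrative sketch only: a one-equilibrium model, ignoring polyborate species.
def borate_fraction(ph, pka=9.24):
    """Fraction of total boron present as borate ion at a given pH.

    Uses [borate] / ([borate] + [boric acid]) = 1 / (1 + 10**(pKa - pH)).
    The default pKa is an assumed literature value, not taken from the text.
    """
    return 1.0 / (1.0 + 10 ** (pka - ph))


if __name__ == "__main__":
    for ph in (3.0, 8.5, 9.5, 11.5):
        print(f"pH {ph:>4}: borate fraction = {borate_fraction(ph):.4f}")
```

Under this simplified model the borate fraction passes one half at the pKa and is negligible at pH 3, which is consistent with anchoring at alkaline pH and release on acidification.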

[0099] Additional tag molecules with pH sensitive anchoring sites include molecules with pH sensitive bonds such as acyloxyalkyl ether, acetal, thioacetal, aminal, imine, carbamate, carbonate, and/or ketal bonds. Solid phases comprising silyl groups additionally can form pH sensitive bonds with hydroxyl, carboxylate, amino, mercapto, or enolizable carbonyl groups on tag molecules.

[0100] Particularly useful reduction sensitive bonds are sterically hindered disulfide bonds, particularly such bonds that are cleavable by a phosphine reducing agent, e.g., TCEP.

[0101] In contrast to tag molecules in the art comprising affinity tags (e.g., ICAT™ reagents), tag molecules comprising pH and/or reduction sensitive anchoring sites generally retain the functional group that binds to the solid phase when disassociated from the solid phase (e.g., by a change in pH, or by a reducing agent). The smaller size of non-affinity based tag molecules such as those containing boronic acid groups facilitates the analysis of tagged peptides by MS^(n).

[0102] Types of Labels

[0103] The type of label selected is generally based on the following considerations:

[0104] The mass of the label should preferably be unique, to shift fragment masses produced by MS analysis to regions of the spectrum with low background. The ion mass signature component is the portion of the labeling moiety which preferably exhibits a unique ion mass signature in mass spectrometric analyses. The sum of the masses of the constituent atoms of the label is preferably uniquely different from the masses of the fragments of all the possible amino acids. As a result, the labeled amino acids and peptides are readily distinguished from unlabeled amino acids and peptides by their ion/mass pattern in the resulting mass spectrum. In a preferred embodiment, the ion mass signature component imparts a mass to a protein fragment produced during mass spectrometric fragmentation that does not match the residue mass for any of the 20 natural amino acids.
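
As a hedged illustration of the last criterion, a candidate label mass can be screened against the monoisotopic residue masses of the 20 natural amino acids. The Python sketch below is an assumption-laden example rather than a disclosed procedure: the function name, the 0.02 Da tolerance, and the rounded residue masses are introduced here for illustration.

```python
# Illustrative sketch only: tolerance and names are assumptions.
# Monoisotopic residue masses (Da) of the 20 natural amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def matching_residues(label_mass, tol=0.02):
    """Return residues whose mass lies within tol (Da) of label_mass."""
    return sorted(aa for aa, mass in RESIDUE_MASS.items()
                  if abs(mass - label_mass) <= tol)


def is_unique_signature(label_mass, tol=0.02):
    """True if the label mass does not coincide with any residue mass."""
    return not matching_residues(label_mass, tol)


if __name__ == "__main__":
    for candidate in (57.02, 113.084, 442.2):
        print(candidate, matching_residues(candidate),
              is_unique_signature(candidate))
```

A candidate mass that returns an empty list would not be mistaken for a single amino acid residue at the chosen tolerance; a fuller screen might also consider sums of two residues.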

[0105] The label should be robust under the fragmentation conditions of MS and not undergo unfavorable fragmentation. Labeling chemistry should be efficient under a range of conditions, particularly denaturing conditions, and the labeled tag preferably remains soluble in the MS buffer system of choice. In one aspect, the label increases the ionization efficiency of the protein, or at least does not suppress it. Alternatively or additionally, the label contains a mixture of two or more isotopically distinct species to generate a unique mass spectrometric pattern at each labeled fragment position.

[0106] In one preferred aspect, tags comprise mass-altering labels which are stable isotopes. In certain preferred embodiments, the method utilizes isotopes of hydrogen, nitrogen, oxygen, carbon, phosphorus or sulfur. Suitable isotopes include, but are not limited to, ²H, ¹³C, ¹⁵N, ¹⁷O, ¹⁸O or ³⁴S. Pairs of tags can be provided, comprising identical tag and peptide portions but distinguishable labels. For example, a pair of tags can comprise isotopically heavy and isotopically light labels, e.g., a ¹⁶O:¹⁸O pair or a ¹H:²H pair.
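
The mass difference introduced by a heavy:light pair follows directly from the isotope masses and the number of substituted atoms. The Python sketch below is a minimal illustration; the function name, the rounded isotope masses, and the substitution counts used in the demonstration are assumptions of the example and are not taken from any particular reagent described herein.

```python
# Illustrative sketch only: rounded isotope masses (Da); names are assumptions.
ISOTOPE_MASS = {
    "1H": 1.00782503, "2H": 2.01410178,
    "12C": 12.0, "13C": 13.00335484,
    "14N": 14.0030740, "15N": 15.0001089,
    "16O": 15.9949146, "18O": 17.9991610,
    "32S": 31.9720707, "34S": 33.9678669,
}


def pair_mass_shift(light, heavy, count):
    """Mass difference (Da) when `count` light atoms are replaced by heavy ones."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return count * (ISOTOPE_MASS[heavy] - ISOTOPE_MASS[light])


if __name__ == "__main__":
    # Hypothetical substitution counts, chosen only for demonstration.
    print(round(pair_mass_shift("1H", "2H", 8), 4))
    print(round(pair_mass_shift("16O", "18O", 2), 4))
```

With these assumed counts, eight deuterium substitutions give a shift of about 8.050 Da and two ¹⁸O substitutions give about 4.008 Da.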

[0107] Types of Solid Phases

[0108] Examples of solid supports suitable for the methods described herein include, but are not limited to: glass supports, plastic supports and the like. These terms are intended to include beads, pellets, disks, fibers, gels, or particles such as cellulose beads, pore-glass beads, silica gels, polystyrene beads optionally cross-linked with divinylbenzene and optionally grafted with polyethylene glycol and optionally functionalized with amino, hydroxy, carboxy, or halo groups, grafted co-poly beads, poly-acrylamide beads, latex beads, dimethylacrylamide beads optionally cross-linked with N,N′-bis-acryloyl ethylene diamine, glass particles coated with hydrophobic polymer, and the like, e.g., material having a rigid or semi-rigid surface; and soluble supports such as low molecular weight non-cross-linked polystyrene.

[0109] However, in one preferred aspect, the solid phase is a resin. As used herein, a “resin” refers to an insoluble material (e.g., a polymeric material) or particle which allows ready separation from liquid phase materials by filtration. Resins can be used to carry tags and/or tagged peptides. Suitable resins include, but are not limited to, agarose, guaracrylamide, carbohydrate-based polymers (e.g., polysaccharide-containing), and the like.

[0110] A “functionalized” solid phase or “functionalized resin” refers to an insoluble, polymeric material or particle comprising active sites for reacting with the anchoring site of a tag molecule, allowing anchored tag molecules to be readily separated (by filtration, centrifugation, etc.) from excess reagents, soluble reaction by-products or solvents. See also, Sherrington (1998), Chem. Commun. 2275-2286, Winter, In Combinatorial Peptide and Non-Peptide Libraries (G. Jung, ed.), pp. 465-509. VCH, Weinheim (1996), and Hudson (1999) J. Comb. Chem. 1:330-360.

[0111] In one aspect, a functionalized solid phase comprises a reactive group for stably associating with a cleavable linker such as a linker shown in FIG. 2.

[0112] In another aspect, a functionalized solid phase comprises cis hydroxy groups, preferably attached by a cyclic ring to the solid phase, or another chemical group suitable for forming a stable covalent association with an alkyl or aryl boronic acid, such as phenyl-B(OH)₂. In one aspect, the solid phase comprises a cyclic alkane, such as 1,2-dihydroxycyclohexane. Preferably, the cyclic alkane comprises a 5-membered ring (see, e.g., FIG. 5A).

[0113] In a further aspect, shown in FIG. 5B, the cyclic alkane is used as a molecular tag while R—B(OH)₂ molecules are used to capture the tag molecules.

[0114] In another particularly useful alternative, a solid phase material is functionalized by attaching avidin molecules, which readily and reversibly complex with biotin ICAT reagents of the invention.

[0115] Methods of Using Non-Affinity Based Isotope Tags

[0116] Isolated tagged peptides according to the invention can be used to facilitate quantitative determination by mass spectrometry of the relative amounts of proteins in different samples. Also, the use of differentially isotopically-labeled reagents as internal standards facilitates quantitative determination of the absolute amounts of one or more proteins present in the sample. Samples that can be analyzed by methods of the invention include, but are not limited to, cell homogenates; cell fractions; biological fluids, including, but not limited to, urine, blood, and cerebrospinal fluid; tissue homogenates; tears; feces; saliva; lavage fluids such as lung or peritoneal lavages; and generally, any mixture of biomolecules, e.g., mixtures including proteins and one or more of lipids, carbohydrates, and nucleic acids such as those obtained by partial or complete fractionation of cell or tissue homogenates.

[0117] Preferably, a proteome is analyzed. By a proteome is intended at least about 20% of total protein coming from a biological sample source, usually at least about 40%, more usually at least about 75%, and generally 90% or more, up to and including all of the protein obtainable from the source. Thus the proteome may be present in an intact cell, a lysate, a microsomal fraction, an organelle, a partially extracted lysate, biological fluid, and the like. The proteome will be a mixture of proteins, generally having at least about 20 different proteins, usually at least about 50 different proteins and in most cases, about 100 different proteins or more.

[0118] Generally, the sample will have at least about 0.05 mg of protein, usually at least about 1 mg of protein or 10 mg of protein or more, typically at a concentration in the range of about 0.1-10 mg/ml. The sample may be adjusted to the appropriate buffer concentration and pH, if desired.

[0119] Using Cleavable Linkers

[0120] FIG. 1 demonstrates one proposed strategy for quantitating proteins in a sample. Suitable samples include, but are not limited to, cell lysates and purified or partially purified proteins. However, the invention is particularly advantageous in that it allows protein quantification to be performed directly from cell lysates, thus minimizing the number of sample processing steps required and maximizing throughput, an essential feature of proteome analysis.

[0121] In the scheme shown in the Figure, proteins from cells are contacted with an agent (e.g., an alkylating agent) to activate one or more reactive groups on the protein so as to render these one or more groups reactive with RS groups on the tag molecule. In one aspect, the tag molecule is stably associated with a solid phase prior to reacting with cellular proteins, or can be reacted with cellular proteins first and then stably associated with the solid phase. In one aspect, the tag molecule comprises a linker molecule and is bound via the linker molecule to the solid phase. Alternatively, the solid phase comprises the linker molecule and the tag molecule is contacted with the solid phase-immobilized linker molecule before or after the tag molecule is contacted with the cellular proteins. It should be obvious to those of skill in the art that the exact sequence of events can vary and that such variations are encompassed within the scope of the invention.

[0122] As shown in FIG. 1, the net result is the formation of a solid phase-linker-tag-protein complex. In the example shown in the Figure, the solid phase is a resin particle (R) and the linker comprises a cleavage site.

[0123] The complex is exposed to a protease, generating solid phase-linker-tag-peptide complexes along with untagged peptides. Suitable proteases include, but are not limited to, one or more of: serine proteases (e.g., trypsin, hepsin, SCCE, TADG12, TADG14); metalloproteases (e.g., PUMP-1); chymotrypsin; cathepsin; pepsin; elastase; pronase; Arg-C; Asp-N; Glu-C; Lys-C; carboxypeptidases A, B, and/or C; dispase; thermolysin; cysteine proteases such as gingipains, and the like. Generally, the type of protease is not limiting; however, preferably, the protease is an extracellular protease. In cases in which the steps prior to protease digestion were performed in the presence of high concentrations of denaturing solubilizing agents, the sample mixture is diluted until the denaturant concentration is compatible with the activity of the proteases used.

[0124] Untagged peptides and other sample components are washed away. The remaining solid phase-linker-tag-peptide complexes are exposed to a cleavage stimulus (e.g., a chemical agent, reducing agent, light, heat, an enzyme, etc.) and the solid phase-linker portion of the complex is separated from the tag-peptide portion of the complex. Tagged peptides are subsequently analyzed by an appropriate method such as LC-MS/MS, discussed further below.

[0125] Preferably, stable isotopes are incorporated into tag molecules prior to contacting the tag with sample proteins.

[0126] In one particularly preferred aspect, proteins are obtained from cells in two different states (e.g., cells which are cancerous and non-cancerous, cells at two different developmental stages, cells exposed to a condition and cells unexposed to the condition, etc.) and are activated (e.g., alkylated) for reaction with the RS groups of tag molecules. Following activation, the two cell samples are incubated with tag molecules labeled with stable isotopes, linker molecules, and solid phases (in any sequence as described above) under suitable conditions to allow solid phase-linker-tag-protein complexes to form. Preferably, tags in the two sample tubes are labeled with different labels (e.g., heavy and light isotopes).

[0127] The samples are combined in the same tube and then proteolyzed (e.g., trypsinized), and peptides which are not immobilized on the solid phase are removed by washing. Peptides are cleaved from the resin by virtue of the cleavable linker (e.g., using 50 mM DTT for a disulfide-based linker) and stable isotopes are retained with the peptides. These provide the means for quantification in a mass spectrometer, because members of a peptide pair differ in mass by the exact amount of mass contributed by the stable isotope. Identical peptide pairs comprise members with heavy and light isotopes or comprise a labeled member and unlabeled member. Peptide sequencing of either member of the pair can be performed by tandem mass spectrometry to identify the parent protein from which the peptide was obtained. This can be repeated on a global scale utilizing only seconds to measure and sequence each peptide. By determining ratios of labeled and unlabeled or differentially labeled peptides, the parent protein can be quantitated in each sample. Thus, protein expression profiles can be obtained for whole cell lysates which include information identifying and quantitating each protein member in the sample.
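
The pairing principle described above, in which members of a peptide pair differ by the mass contributed by the stable isotope, can be sketched in Python as a search of a peak list for signals separated by the expected spacing. This is a simplified illustration under stated assumptions: the function name, the (m/z, intensity) tuple layout, the tolerance, and the example values are hypothetical, and real spectra would also require charge-state and isotope-envelope handling.

```python
# Illustrative sketch only: peak layout, tolerance and names are assumptions.
def find_isotope_pairs(peaks, mass_shift, charge=1, tol=0.01):
    """Find light/heavy peak pairs separated by mass_shift / charge in m/z.

    peaks: iterable of (mz, intensity) tuples.
    Returns a list of (light_mz, heavy_mz, light_to_heavy_ratio).
    """
    spacing = mass_shift / charge  # expected m/z separation for this charge
    ordered = sorted(peaks)
    pairs = []
    for i, (light_mz, light_int) in enumerate(ordered):
        for heavy_mz, heavy_int in ordered[i + 1:]:
            close = abs((heavy_mz - light_mz) - spacing) <= tol
            if close and heavy_int > 0:
                pairs.append((light_mz, heavy_mz, light_int / heavy_int))
    return pairs


if __name__ == "__main__":
    # Hypothetical doubly charged pair with an assumed 8.05 Da label difference.
    spectrum = [(500.25, 200.0), (504.275, 100.0), (610.0, 50.0)]
    print(find_isotope_pairs(spectrum, mass_shift=8.05, charge=2))
```

In this hypothetical spectrum only the first two peaks are separated by the expected spacing, and their intensity ratio of 2.0 would be read as a two-fold difference between the samples.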

[0128] Use of pH Sensitive Anchoring Sites on Tag Molecules

[0129] A scheme for using tag molecules comprising pH sensitive anchoring sites is shown in FIG. 2. In one aspect, proteins are activated for reaction with RS groups of the tag molecule. Where the RS group is a cysteine-reactive moiety, disulfide bonds of proteins in a sample are reduced to free SH groups using a reducing agent (e.g., tri-n-butylphosphine, mercaptoethylamine, dithiothreitol, and the like). If required, this reaction can be performed in the presence of solubilizing agents including high concentrations of urea and detergents to maintain protein solubility.

[0130] The proteins are contacted with suitable tag molecules, such as, for example, a biotin ICAT reagent or an RS—R—B(OH)₂ molecule, under conditions suitable for forming stable associations between the RS group and the activated proteins of the sample. Tag-protein complexes are reacted with one or more proteases (e.g., trypsin) to generate tag-peptide complexes and untagged peptides. Tagged peptides are contacted with a solid phase under conditions suitable for forming stable associations with the solid phase and untagged peptides are washed away. As above, the order of contacting with the solid phase can be varied. For example, tag molecules can be bound to the solid phase prior to contacting with proteins in a sample. Preferably, the pH is about 8.5 or higher, to maintain covalent bonding between the tag molecule and the solid phase during the contacting steps and wash steps. Reactions generally can be performed at room temperature.

[0131] The pH of the sample is reduced to less than about 8.5, and preferably to a pH of less than 3, to remove the tagged peptide from the support. As above, tagged peptides may subsequently be analyzed by LC-MS/MS. Also, as above, parallel samples contacted with differentially labeled tags can be combined for protease digestion steps, purification of tagged molecules, and subsequent analysis by LC-MS/MS to determine ratios of labeled tagged peptides in the combined sample. Optimal conditions (e.g., pH and temperature) for removing tag molecules may be determined using an assay such as described in Example 1.

[0132] Quantitation of Proteins in Samples

[0133] Whether using the cleavable linker scheme or the pH sensitive anchoring site scheme, quantitation of proteins involves the same general principles. For the comparative analysis of several samples, one sample is designated a reference to which the other samples are related. Typically, the reference sample is labeled with the isotopically heavy reagent and the experimental samples are labeled with the isotopically light form of the reagent, although this choice of reagents is arbitrary.

[0134] After tagging, aliquots of the samples labeled with the isotopically different reagents (e.g., heavy and light reagents, or labeled and unlabeled reagents) are combined and all the subsequent steps are performed on the pooled samples. Combination of the differentially labeled samples at this early stage of the procedure eliminates variability due to subsequent reactions and manipulations. Preferably equal amounts of each sample are combined.

[0135] Following protease digestion and purification of tagged peptides in a combined sample, the mixture of proteins is submitted to a separation process, which preferably allows the separation of the protein mixture into discrete fractions. Each fraction is preferably substantially enriched in only one labeled protein of the protein mixture. The methods of the present invention are utilized in order to identify and/or quantify and/or determine the sequence of a tagged peptide. Within preferred embodiments of the invention, the tagged peptide is “substantially pure” after the separation procedure, which means that the polypeptide is about 80% homogeneous, and preferably about 99% or greater homogeneous. Many methods well known to those of ordinary skill in the art may be utilized to purify tagged peptides. Representative examples include HPLC, Reverse Phase-High Pressure Liquid Chromatography (RP-HPLC), gel electrophoresis, chromatography, or any of a number of peptide purification methods as are known in the art.

[0136] A preferred purification method is microcapillary liquid chromatography.

[0137] Analysis of isolated, tagged peptides by microcapillary LC-MS^(n) or CE-MS^(n) with data dependent fragmentation is performed using methods and instrument control protocols well-known in the art and described, for example, in Ducret et al., 1998; Figeys and Aebersold, 1998; Figeys et al., 1996; or Haynes et al., 1998. Also encompassed within the scope of the invention, although less preferred, are mass spectrometry methods such as fast atom bombardment (FAB), plasma desorption (PD), thermospray (TS), and matrix-assisted laser desorption/ionization (MALDI) methods.

[0138] In the analysis step, both the quantity and sequence identity of the proteins from which the tagged peptides originated can be determined by automated multistage MS (MS^(n)). This is achieved by the operation of the mass spectrometer in a dual mode in which it alternates in successive scans between measuring the relative quantities of peptides eluting from the capillary column and recording the sequence information of selected peptides. Peptides are quantified by measuring in the MS mode the relative signal intensities for pairs of peptide ions of identical sequence that are tagged with the molecules comprising light or heavy forms of isotope, respectively, or labeled and unlabeled members of a peptide pair, and which therefore differ in mass by the mass differential encoded within the labeled tagged reagent.

[0139] Peptide sequence information is automatically generated by selecting peptide ions of a particular mass-to-charge (m/z) ratio for collision-induced dissociation (CID) in the mass spectrometer operating in the MS^(n) mode (Link, A. J. et al., 1997; Gygi, S. P. et al., 1999; and Gygi, S. P. et al., 1999). The resulting CID spectra are then automatically correlated with sequence databases to identify the protein from which the sequenced peptide originated. Combination of the results generated by MS and MS^(n) analyses of labeled tagged peptide samples therefore determines the relative quantities, as well as the sequence identities, of the components of protein mixtures in a single, automated operation.

[0140] The approach employed herein for quantitative proteome analysis is based on two principles. First, a short sequence of contiguous amino acids from a protein (5-25 residues) contains sufficient information to uniquely identify that protein. Protein identification by MS^(n) is accomplished by correlating the sequence information contained in the CID mass spectrum with sequence databases, using computer searching algorithms known in the art (Eng, J. et al., 1994; Mann, M. et al., 1994; Qin, J. et al., 1997; Clauser, K. R. et al., 1995). Second, pairs of identical peptides tagged with the light and heavy affinity tagged reagents, or labeled and unlabeled peptides, respectively (or, in analysis of more than two samples, sets of identical tagged peptides in which each set member is differentially isotopically labeled), are chemically identical and therefore serve as mutual internal standards for accurate quantitation.

[0141] The MS measurement readily differentiates between peptides originating from different samples, representing for example different cell states, because of the difference between isotopically distinct reagents attached to the peptides. The ratios between the intensities of the differing weight components of these pairs or sets of peaks provide an accurate measure of the relative abundance of the peptides (and hence the proteins) in the original cell pools because the MS intensity response to a given peptide is independent of the isotopic composition of the reagents (De Leenheer, A. P. et al., 1992).
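
As a hedged illustration of how peptide-level intensity ratios might be summarized into a protein-level measure, the Python sketch below takes light and heavy intensities for several peptides assigned to one protein and reports a single ratio. The use of the median as the summary statistic, the function names, and the example intensities are assumptions of this example; the text above specifies only that the intensity ratios reflect relative abundance.

```python
# Illustrative sketch only: the median summary and names are assumptions.
from statistics import median


def peptide_ratios(intensity_pairs):
    """Light/heavy intensity ratio for each (light, heavy) pair.

    Pairs with a non-positive heavy intensity are skipped.
    """
    return [light / heavy for light, heavy in intensity_pairs if heavy > 0]


def protein_ratio(intensity_pairs):
    """Summarize the peptide ratios of one protein by their median."""
    ratios = peptide_ratios(intensity_pairs)
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


if __name__ == "__main__":
    # Hypothetical intensities for three peptides from the same protein,
    # with the reference sample carrying the heavy label.
    pairs = [(200.0, 100.0), (150.0, 100.0), (900.0, 300.0)]
    print(peptide_ratios(pairs), protein_ratio(pairs))
```

With the heavy-labeled sample as the reference, a protein ratio of 2.0 in this hypothetical case would indicate roughly twice as much of the protein in the light-labeled sample.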

[0142] Several beneficial features of the method are apparent. At least two peptides can be detected from each protein in a pooled sample mixture. Therefore, both quantitation and protein identification can be redundant. Further, where the peptide group which reacts with the RS group of a tag molecule is relatively rare (e.g., a cysteinyl residue), the presence of such a group in a tagged peptide adds an additional powerful constraint for database searching (Sechi, S. et al., 1998). The use of relatively rare peptide groups and the tagging and selective enrichment for peptides containing these groups significantly reduces the complexity of the peptide mixture generated by the concurrent digestion of multiple proteins and facilitates MS^(n) analysis. For example, a theoretical tryptic digest of the entire yeast proteome (6113 proteins) produces 344,855 peptides, but only 30,619 of these peptides contain a cysteinyl residue. Additionally, the chemistries used in both schemes discussed above are compatible with LC-MS/MS analysis.
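
The reduction in complexity noted above (30,619 of 344,855 peptides, or roughly 9%) can be illustrated with a theoretical digest. The Python sketch below assumes the common simplified trypsin rule of cleavage after lysine or arginine except when the next residue is proline, with no missed cleavages; the function names and the example sequence are hypothetical and are not drawn from the yeast proteome.

```python
# Illustrative sketch only: simplified cleavage rule, hypothetical sequence.
def tryptic_digest(sequence):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def cysteine_peptides(peptides):
    """Keep only peptides containing at least one cysteine residue."""
    return [p for p in peptides if "C" in p]


if __name__ == "__main__":
    peptides = tryptic_digest("MKCAARPGKLLDEFR")  # hypothetical sequence
    print(peptides, cysteine_peptides(peptides))
```

Applied to a whole proteome, the same filter would retain only the cysteine-containing subset that a cysteine-reactive RS group is expected to capture.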

[0143] The methods described above generally start with about 100 μg of protein and require no fractionation techniques. However, the methods are compatible with any biochemical, immunological or cell biological fractionation methods that reduce sample complexity and enrich for proteins of low abundance while quantitation is maintained. This method can be redundant in both quantitation and identification if multiple groups on a single protein bind to an RS group of a tag molecule.

[0144] The methods of this invention can be applied to analysis of low abundance proteins and classes of proteins with particular physico-chemical properties including poor solubility, large or small size and extreme pI values.

[0145] An application of the chemistry and methods described above is the establishment of quantitative profiles of complex protein samples and ultimately total lysates of cells and tissues.

[0146] In addition, the reagents and methods of this invention may be used to determine sites of protein modifications and therefore the abundance of modified proteins in a sample. For example, in one aspect, when the RS group reacts with a modified residue on a protein, differentially isotopically labeled tagged peptides are used to determine the sites of induced protein modification. Modified peptides are identified in a protease-digested sample mixture by fragmentation in the ion source of an ESI-MS instrument and their relative abundances are determined by comparing the ion signal intensities of an experimental sample with the intensity of an included, isotopically labeled standard. Modifications included within the scope of the invention include, but are not limited to, glycosylation, methylation, acylation, phosphorylation, ubiquitination, farnesylation, and ribosylation.

[0147] In one aspect, the RS group is a boron tag of reversed polarity, that is, the two hydroxyl groups of R—B(OH)₂ are exposed in solution to bind to glycosylated peptides. In this scenario, the boron tag is attached to the solid phase, SP, via another type of molecule such as a catechol group.

[0148] In another aspect, a cyclic alkane comprising cis hydroxy groups is used as a tag molecule while an R—B(OH)₂ molecule is attached to a support and used to capture the tag molecules (see, e.g., FIG. 5).

[0149] In still another aspect, a biotin with an alkylating group is used as a tag molecule. The tag portion is cleaved, preferably through a disulfide bond, from the biotin portion, which is attached to the support through an avidin complex.

[0150] Quantitative Analysis of Surface Proteins in Cells and Tissue

[0151] The cell exterior membrane and its associated proteins (cell surface proteins) participate in sensing external signals and responding to environmental cues. Changes in the abundance of cell surface proteins can reflect a specific cellular state or the ability of a cell to respond to its changing environment. Thus, the comprehensive, quantitative characterization of the protein components of the cell surface can identify marker proteins or constellations of marker proteins characteristic for a particular cellular state, or explain the molecular basis for cellular responses to external stimuli. Indeed, changes in expression of a number of cell surface receptors such as Her2/neu, erbB, IGFI receptor, and EGF receptor have been implicated in carcinogenesis and a current immunological therapeutic approach for breast cancer is based on the infusion of an antibody (Herceptin, Genentech, Palo Alto, Calif.) that specifically recognizes Her2/neu receptor.

[0152] Cell surface proteins are also experimentally accessible. Diagnostic assays for cell classification and preparative isolation of specific cells by methods such as cell sorting or panning are based on cell surface proteins. Thus, differential analysis of cell surface proteins between normal and diseased (e.g., cancer) cells can identify important diagnostic or therapeutic targets. While the importance of cell surface proteins for diagnosis and therapy of cancer has been recognized, membrane proteins have been difficult to analyze. Due to their generally poor solubility they tend to be under-represented in standard 2D gel electrophoresis patterns, and attempts to adapt 2D electrophoresis conditions to the separation of membrane proteins have met with limited success. The method of this invention can overcome the limitations inherent in the traditional techniques.

[0153] Methods can be applied to enhance the selectivity for tagged peptides derived from cell surface proteins. For example, tagged cell surface proteins can be protease-digested directly on the intact cells to generate tagged peptides, which are then purified and analyzed as discussed above. In addition, traditional cell membrane preparations may be used as an initial step to enrich cell surface proteins. These methods can include gentle cell lysis with a Dounce homogenizer and a series of density gradient centrifugations to isolate membrane proteins prior to proteolysis. This method can provide highly enriched preparations of cell surface proteins. In the application of the methods of this invention to cell surface proteins, once the tagged proteins are fragmented, the tagged peptides behave no differently from the peptides generated from more soluble samples.

[0154] Methods according to the invention can be used for qualitative and/or quantitative analysis of global protein expression profiles in cells and tissues, i.e., analysis of proteomes. The method can also be employed to screen for and identify proteins whose expression level in cells, tissue or biological fluids is affected by a stimulus (e.g., administration of a drug or contact with a potentially toxic material), by a change in environment (e.g., nutrient level, temperature, passage of time) or by a change in condition or cell state (e.g., disease state, malignancy, site-directed mutation, gene knockouts) of the cell, tissue or organism from which the sample originated. The proteins identified in such a screen can function as markers for the changed state. For example, comparisons of protein expression profiles of normal and malignant cells can result in the identification of proteins whose presence or absence is characteristic and diagnostic of the malignancy.

[0155] The methods herein can be employed to screen for changes in the expression or state of enzymatic activity of specific proteins. These changes may be induced by a variety of compounds or chemicals, including pharmaceutical agonists or antagonists, or potentially harmful or toxic materials. The knowledge of such changes may be useful for diagnosing abnormal physiological responses and for investigating complex regulatory networks in cells.

[0156] Compounds which can be evaluated include, but are not limited to: drugs; toxins; proteins; polypeptides; peptides; amino acids; antigens; cells, cell nuclei, organelles, portions of cell membranes; viruses; receptors; modulators of receptors (e.g., agonists, antagonists, and the like); enzymes; enzyme modulators (e.g., inhibitors, cofactors, and the like); enzyme substrates; hormones; nucleic acids (e.g., oligonucleotides; polynucleotides; genes, cDNAs; RNA; antisense molecules, ribozymes, aptamers), and combinations thereof. Compounds also can be obtained from synthetic libraries from drug companies and other commercially available sources known in the art (including, but not limited to, the LeadQuest® library) or can be generated through combinatorial synthesis using methods well known in the art. A compound is identified as a modulating agent if it alters the expression or site of modification of a polypeptide and/or if it alters the amount of modification by an amount that is significantly different from the amount observed in a control cell (e.g., not treated with compound), with significance set at p<0.05.
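By way of illustration only, the p<0.05 criterion above can be sketched in Python. The text does not name a statistical test; the exact two-sided permutation test used here, the function name `permutation_p_value`, and the replicate values are all assumptions chosen for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference of group means.

    The choice of test is an assumption; the text only states p < 0.05.
    """
    pooled = list(treated) + list(control)
    k = len(treated)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling k of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), k):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical normalized abundances of one tagged peptide in replicate samples.
treated = [2.1, 2.4, 2.2, 2.6]
control = [1.0, 1.1, 0.9, 1.2]
p = permutation_p_value(treated, control)
print(f"p = {p:.4f}; modulating agent candidate: {p < 0.05}")
```

With four replicates per group, the smallest attainable two-sided p value is 2/70, so fewer replicates could never satisfy the p<0.05 threshold under this particular test.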

[0157] Compounds identified as modulating agents are used in methods of treatment of pathologies associated with abnormal sites/levels of the particular modification. For administration to a patient, one or more such compounds are generally formulated as a pharmaceutical composition. Preferably, a pharmaceutical composition is a sterile aqueous or non-aqueous solution, suspension or emulsion, which additionally comprises a physiologically acceptable carrier (i.e., a non-toxic material that does not interfere with the activity of the active ingredient). More preferably, the composition also is non-pyrogenic and free of viruses or other microorganisms. Any suitable carrier known to those of ordinary skill in the art may be used. Representative carriers include, but are not limited to: physiological saline solutions, gelatin, water, alcohols, natural or synthetic oils, saccharide solutions, glycols, injectable organic esters such as ethyl oleate or a combination of such materials. Optionally, a pharmaceutical composition additionally contains preservatives and/or other additives such as, for example, antimicrobial agents, anti-oxidants, chelating agents and/or inert gases, and/or other active ingredients.

[0158] Routes and frequency of administration, as well as doses, will vary from patient to patient. In general, the pharmaceutical composition is administered intravenously, intraperitoneally, intramuscularly, subcutaneously, intracavity or transdermally. Between 1 and 6 doses are administered daily. A suitable dose is an amount that is sufficient to show improvement in the symptoms of a patient afflicted with a disease associated with an aberrant level of expression of a particular protein or the site or amount of modification of the protein. Such improvement may be detected by monitoring appropriate clinical or biochemical endpoints as is known in the art. In general, the amount of modulating agent present in a dose, or produced in situ by DNA present in a dose (e.g., where the modulating agent is a polypeptide or peptide encoded by the DNA), ranges from about 1 μg to about 100 mg per kg of host. Suitable dose sizes will vary with the size of the patient, but will typically range from about 10 mL to about 500 mL for a 10-60 kg animal. A patient can be a mammal, such as a human, or a domestic animal.

[0159] The methods herein can also be used to implement a variety of clinical and diagnostic analyses to detect the presence, absence, deficiency or excess of a given protein or protein function in a biological fluid (e.g., blood), or in cells or tissue. The methods are particularly useful in the analysis of complex mixtures of proteins, i.e., those containing 5 or more distinct proteins or protein functions. Therefore, in one aspect, the methods are used to compare and quantitate levels of proteins and/or sites and amounts of protein modifications in samples between a normal cell sample and a cell sample from a patient with a pathological condition (preferably, the cell sample is the target of the pathological condition) in order to identify the presence, absence, deficiency or excess of a given protein or protein function which is associated with the pathological condition.

[0160] Kits

[0161] The invention further provides a kit comprising reagents and/or compositions as described above. For example, in one aspect the invention provides a tag molecule and one or more of a reagent selected from the group consisting of: an activating agent for providing active groups on a protein which bind to the reactive site of the tag molecule; a solid phase; one or more agents for lysing a cell; a pH controlling agent; a reducing agent; one or more proteases; one or more cell samples or fractions thereof. In one aspect, the tag molecule is further stably associated with a peptide, i.e., a tagged reference peptide is included suitable for a particular assay of choice.

[0162] The invention also provides kits comprising a plurality of tagged peptide molecules, each tagged peptide molecule comprising a peptide and a tag molecule stably associated with the peptide, the tag molecule further comprising an isotope label, and a pH and/or reduction sensitive anchoring site for anchoring the tag molecule to a solid phase. In one aspect, the kit comprises pairs of tagged peptides and each member of a pair of tagged peptides comprises an identical peptide and is differentially labeled from the other member of the pair. In another aspect, the kit comprises at least one set of tagged peptides, the set comprising different peptides corresponding to a single protein. In still another aspect, at least one set of tagged peptides comprises peptides corresponding to modified and unmodified forms of a single protein. In a further aspect, the kit comprises at least one set of tagged peptides from a first cell at a first cell state and at least one set of tagged peptides from a second cell at a second cell state. For example, the first cell may be a normally proliferating cell while the second cell is an abnormally proliferating cell (e.g., a cancer cell). First and second cells may also represent different stages of cancer, different developmental stages, cells exposed to agents (e.g., drugs, potentially toxic or carcinogenic materials) or conditions (e.g., pH, temperature, nutrient levels, passage of time) and cells not exposed to agents or conditions, as well as cells which do or do not express particular recombinant DNA constructs.

EXAMPLES

[0163] The invention will now be further illustrated with reference to the following examples. It will be appreciated that what follows is by way of example only and that modifications to detail may be made while still falling within the scope of the invention.

Example 1 Arylboronic Acids as New ICAT Reagents

[0164] Arylboronic Acid-Immobilized Glutathione on a Carbohydrate Affinity Column

[0165] A column of carbohydrate was immobilized on agarose (Calbiochem, gal-1-1,3-gal on agarose, cat. # 215364, 2 mls packed resin) using 0.05% SDS in 50 mM ammonium bicarbonate, pH=8.1; however, SDS may be omitted. The column was equilibrated with at least 10 column volumes of the 50 mM ammonium bicarbonate (AmBic), without detergent, before sample was applied. An arylboronic conjugate was synthesized using standard chemistries. 68 mgs GSH in 1.9 mls of water was combined with 100 μL of 1M potassium phosphate, pH=7.4, and stirred for 5 minutes. 8.8 mgs of arylboronic acid were added, which dissolved within about 15 minutes.

[0166] The scheme for generating the conjugates is shown below:

[0167] One ml of AmBic (1M) was added and the solution was stirred another 5 minutes, after which 100 μL of 150 μM fluorescein was added. The column was washed with 50 mM AmBic solution at a flow rate of about 1 ml/minute. Five ml fractions were collected and the amount of fluorescein in the fractions was determined. A large amount of fluorescein initially eluted. After collecting fraction 9, elution buffer consisting of 100 mM glycine, pH=2.5, and containing 25 mM glucose was used to wash the column. Five ml fractions were collected through fraction 15. Absorbance was determined at 254 and 490 nm, to determine the presence of aryl groups and fluorescein, respectively, in the fractions. The elution profile is shown in FIG. 4.

[0168] Fraction 10 showed a significant amount of product. Fractions 10-12 were combined and saved as a combined sample (combined sample 1) at −80° C. for LC-MS analysis, as were the flow-through fractions 3-6 (combined sample 2). Thus, even without optimal conditions for recovery, significant amounts of product were recovered.

[0169] These results demonstrate that boronic acid conjugates can be used to provide pH sensitive molecular tags which can be recovered at high efficiency.

Example 2 Biotin Derivatives as New Catch and Release Reagents

[0170] Preparation of new Biotin Derivatives

[0171] A series of new biotin-based ICAT reagents is provided by the invention, each comprising a biotin residue and an alkylating group which are connected by a linker. Preferred alkylating groups are suitable for alkylating serine residues of polypeptides. Preferred biotin derivatives comprise biotin and a 2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyric acid coupled through a di(2-aminoethyl)ether, which may have one or more ethylene glycol repeat units interposed between the amino residues, e.g., a linker of the formula: -NH((CH₂)₂O)_(n)(CH₂)₂NH-, where n is an integer of from 0 to about 5.
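By way of illustration only, the linker formula above can be expanded into an elemental composition and an approximate mass for each value of n. The sketch below assumes rounded average atomic masses and covers only the -NH((CH₂)₂O)_(n)(CH₂)₂NH- fragment, not the whole reagent; the function names are hypothetical.

```python
# Rounded average atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def linker_composition(n: int) -> dict:
    """Elemental composition of the fragment -NH((CH2)2O)n(CH2)2NH-."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    # Two NH groups, one terminal (CH2)2 unit, and n (CH2)2O repeat units.
    return {"C": 2 * n + 2, "H": 4 * n + 6, "N": 2, "O": n}


def linker_mass(n: int) -> float:
    """Approximate average mass of the linker fragment in g/mol."""
    comp = linker_composition(n)
    return sum(ATOMIC_MASS[element] * count for element, count in comp.items())


for n in range(6):
    print(n, linker_composition(n), round(linker_mass(n), 2))
```

Each additional ethylene glycol repeat unit adds C₂H₄O, about 44.05 g/mol, so reagents differing only in n are readily distinguished by mass.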

[0172] Biotin derivatives can be prepared by coupling of biotin, linker group and alkylating group in sequential amide bond forming reactions and amine deprotection steps. An illustrative synthesis is provided in the scheme of FIG. 6 (Boc is C(O)O^(t)Bu, NHS is N-hydroxy succinimide, DCC is dicyclohexylcarbodiimide, DMF is N,N-dimethylformamide, DIEA is diisopropylethylamine, and n is an integer from 0 to about 5).

[0173] Examples of biotin derivatives prepared in this manner include:

[0174] Experimental Conditions for Amide Coupling to Generate Biotin Derivative (IV)

[0175] Intermediate (III) (12.5 mgs) was combined with an excess of iodo-acetic acid 2,5-dioxo-pyrrolidin-1-yl ester (about 6 mgs) in methanol. One equivalent of DIEA (diluted in methanol) and additional methanol (about 0.5 mL) were added to the reaction mixture. After stirring for 30 minutes, an aliquot of reaction mixture spotted on a thin layer chromatography plate did not stain when exposed to ninhydrin solution, indicating an absence of residual amino functionality. The biotin derivative (IV) was purified using reverse phase HPLC (column was 1×25 cm, flow rate 1 ml/min, detection at 214 nm). Mobile phase A was 5% acetonitrile (ACN) in H₂O, 0.06% trifluoroacetic acid (TFA), and mobile phase B was 95% ACN in H₂O, 0.06% TFA. Mobile phase composition was varied over elution time as follows: t_(0min), 5% B; t_(5min), 5% B; t_(20min), 50% B; t_(25min), 50% B; t_(26min), 100% B (4 mls/min); t_(28min), 100% B (4 mls/min); t_(30min), 5% B (3 mls/min).
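By way of illustration only, the gradient program above can be written as a table of time points and interpolated. The sketch assumes linear ramps between the listed time points, which the text does not state, and ignores the flow rate changes; the function names are hypothetical.

```python
# (time in minutes, percent mobile phase B), taken from the listed time points.
GRADIENT = [(0, 5), (5, 5), (20, 50), (25, 50), (26, 100), (28, 100), (30, 5)]


def percent_b(t_min: float) -> float:
    """Percent B at time t_min, assuming linear ramps between time points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a contiguous gradient table")


def percent_acetonitrile(t_min: float) -> float:
    """Overall percent ACN, given A = 5% ACN and B = 95% ACN."""
    b = percent_b(t_min)
    return 0.05 * (100 - b) + 0.95 * b


for t in (0, 12.5, 22, 27, 30):
    print(t, percent_b(t), round(percent_acetonitrile(t), 1))
```

Under these assumptions, the mobile phase at 22 minutes is on the 50% B plateau, which corresponds to about 50% acetonitrile overall.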

[0176] FIG. 7 is an HPLC trace of the crude reaction mixture. The off-scale peak at about 22 minutes corresponded to the product (Compound IV) and was isolated. The eluent corresponding to the 22 minute peak from several HPLC runs was combined and lyophilized in the dark without heating to afford compound (IV) as a fluffy white solid (8.3 mgs; 60% yield). FIG. 8 provides an LC-MS trace of the fluffy white powder, which has a single peak at 762 corresponding to the [M+H]⁺¹ peak of the biotin derivative (IV).

Example 3 Preparation of Biotin conjugates

[0177] Conjugates of biotin derivative (IV) with glutathione may be prepared according to the scheme shown below:

[0178] A 25 mM stock solution of the Biotin derivative (IV) in DMSO was prepared and then diluted with 50 mM Tris buffer, pH=8.0, to a final concentration of 250 μM. A 150 μM stock solution of glutathione (GSH) in 50 mM Tris buffer was also prepared. The Biotin derivative (IV) stock solution and the GSH stock solution were mixed (1:1 by volume) and incubated for thirty minutes.

[0179] A reference solution of the Biotin derivative (IV) was prepared by diluting the 25 mM DMSO solution with 50 mM Tris buffer, pH=8.0, to a final concentration of 125 μM.
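By way of illustration only, the concentrations in this example can be checked with simple dilution arithmetic. The sketch below shows that mixing the 250 μM Biotin derivative (IV) solution 1:1 with 150 μM GSH gives 125 μM of the derivative, the same concentration as the reference solution. The function names are hypothetical.

```python
def dilution_factor(stock_uM: float, final_uM: float) -> float:
    """Fold dilution needed to go from the stock to the final concentration."""
    return stock_uM / final_uM


def mix_equal_volumes(conc_a_uM: float, conc_b_uM: float) -> tuple:
    """Concentrations of A and B after mixing equal volumes of each solution."""
    return conc_a_uM / 2, conc_b_uM / 2


STOCK_UM = 25_000  # the 25 mM DMSO stock, expressed in µM

print(dilution_factor(STOCK_UM, 250))  # fold dilution for the reaction solution
print(dilution_factor(STOCK_UM, 125))  # fold dilution for the reference solution

biotin_uM, gsh_uM = mix_equal_volumes(250, 150)
print(biotin_uM, gsh_uM, round(biotin_uM / gsh_uM, 2))
```

On these figures the Biotin derivative (IV) is present in roughly 1.7-fold molar excess over GSH, which is consistent with unreacted derivative remaining detectable after the incubation.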

[0180] FIG. 9 provides reverse phase HPLC traces of the reaction mixture and reference solution. The reverse phase HPLC analysis was conducted using a 2.0×250 mm column. The peak at about 21.5 minutes corresponds to unreacted biotin derivative (IV), and the product GSH conjugate eluted at about 18 minutes.

Example 4 Polypeptide-Biotin Conjugates

[0181] A conjugate between biotin derivative IV and a peptide corresponding to the active site (residues 200-221) of human protein tyrosine phosphatase 1B (PTP1B), having the sequence ESGSLSPEHGPVVVHCSAGIGR (MS trace shown in FIG. 10: [M+H]⁺¹=2176.4 and [M+2H]⁺²=1088.7), was prepared and purified. The polypeptide was tagged at cysteine-215.
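By way of illustration only, the two reported peaks for the peptide can be checked against each other. The sketch below converts the singly charged peak to a neutral mass and predicts the doubly charged peak, assuming protonated ions and a proton mass of 1.00728; the function names are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def neutral_mass_from_mz(mz: float, z: int) -> float:
    """Neutral mass M of an [M+zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def mz_from_neutral_mass(mass: float, z: int) -> float:
    """Expected m/z of the [M+zH]z+ ion for a neutral mass M."""
    return (mass + z * PROTON) / z


peptide_mass = neutral_mass_from_mz(2176.4, 1)
print(round(peptide_mass, 2))
print(round(mz_from_neutral_mass(peptide_mass, 2), 1))  # 1088.7
```

The predicted doubly charged peak agrees with the reported value of 1088.7.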

[0182] Step 1. Synthesis

[0183] A 44 μL aliquot of a stock solution of PTP1B (10 nmol) was diluted with 454 μL of 100 mM ammonium carbonate (pH=8.0) containing 10% methanol. A five-fold excess of Biotin derivative (IV) (2 μL of a 25 mM DMSO solution, 50 nmol) was added to the reaction mixture (final volume=500 μL). After incubating the reaction mixture for 15 minutes at room temperature, 5 μL of D-penicillamine (3,3-dimethyl-D-cysteine; D-PEN) (100 mM stock solution, 500 nmol) was introduced into the reaction mixture to consume residual Biotin derivative (IV).
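By way of illustration only, the amounts stated in this step can be reproduced from the volumes and stock concentrations. The sketch relies on the unit identity μL × mM = nmol; the function and variable names are hypothetical.

```python
def nmol(volume_uL: float, conc_mM: float) -> float:
    """Amount of solute in nmol; 1 µL of a 1 mM solution holds 1 nmol."""
    return volume_uL * conc_mM


peptide_nmol = 10.0            # stated amount of peptide in the 44 µL aliquot
tag_nmol = nmol(2, 25)         # Biotin derivative (IV) from the 25 mM stock
quench_nmol = nmol(5, 100)     # D-PEN from the 100 mM stock
reaction_volume_uL = 44 + 454 + 2

print(tag_nmol, tag_nmol / peptide_nmol)   # 50 nmol, five-fold excess
print(quench_nmol / tag_nmol)              # D-PEN excess over the tag
print(reaction_volume_uL)                  # 500 µL before quenching
print(1000 * tag_nmol / reaction_volume_uL)  # tag concentration in µM
```

The D-PEN is added in ten-fold molar excess over the total Biotin derivative (IV), so it can consume whatever tag remains after the incubation.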

[0184] FIG. 11 is an HPLC trace of the reaction mixture, which had a peak at 14.24 minutes corresponding to the conjugate. The mass spectrum shown in FIG. 12 includes peaks corresponding to the [M+2H]⁺² and [M+3H]⁺³ ions at 1405.5 and 937 atomic mass units, respectively. Various sodium adducts for each peak are also present in the mass spectrum of FIG. 12.
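By way of illustration only, the charge states assigned to the two conjugate peaks can be inferred from their spacing. The sketch assumes the peaks are adjacent charge states of the same protonated species, with the higher m/z peak carrying charge z and the lower carrying z+1, so that z = (m_low − H)/(m_high − m_low); the function names are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def infer_charge(mz_high: float, mz_low: float) -> int:
    """Charge z of the higher-m/z peak, given the adjacent (z+1) peak."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass M of an [M+zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


z = infer_charge(1405.5, 937.0)
print(z)                                      # charge of the 1405.5 peak
print(round(neutral_mass(1405.5, z), 1))      # mass from the 2+ peak
print(round(neutral_mass(937.0, z + 1), 1))   # mass from the 3+ peak
```

Both peaks give a neutral mass near 2808 to 2809; the difference of about one mass unit between the two estimates reflects the rounding of the reported 937 value.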

[0185] The HPLC trace shown in FIG. 11 also contains peaks corresponding to unreacted PTP1B (11.44 minutes) and an adduct which may result from coupling of the D-PEN and non-conjugated peptide. One skilled in the art will recognize that extended incubation times and other process optimization may be beneficial to maximize the yield of the desired conjugate and reduce by-product formation.

[0186] Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) (5 mM final concentration) was added to the reaction mixture as a reductant. FIG. 13 is an MS chromatogram of the reduced reaction mixture. The solution is a mixture of PTP1B conjugate and various byproducts, including a conjugate of D-PEN.

[0187] Step 2: Purification of the conjugate

[0188] Two tubes were charged with 150 μL aliquots of the reaction mixture prepared in Step 1 (each aliquot contained the desired conjugate, about 15 nmol total biotin, and roughly 3 nmol test peptide). Packed immobilized avidin (600 μL having about 24 nmol total biotin binding capacity) was introduced into each tube and the heterogeneous mixtures were agitated for 20 minutes at room temperature. The mixtures were transferred to a spinning filter and the liquid phase was removed by filtration. The beads were washed with 300 μL of an aqueous ammonium bicarbonate buffer containing 10% methanol and the supernatant was removed in a centrifuge. The washing cycle was repeated twice (three total wash cycles) before the beads were resuspended in a clean ammonium bicarbonate buffer solution containing 5 mM TCEP. The suspension was incubated at 40° C. for 45 minutes and then the liquid phase was collected using a spinning filter. The beads were washed and filtered twice with 100 μL aliquots of fresh ammonium bicarbonate buffer containing no additional TCEP. The TCEP-containing liquid phase and the subsequent washings were combined to afford the conjugate in 780 μL of solution. A sample was diluted four-fold with mass spec loading buffer and 2 μL of this solution was analyzed by LC-MS. The LC trace of the LC-MS analysis had a single peak at 11.81 minutes and is depicted in FIG. 14. The mass spectrum of the peak at 11.81 minutes is depicted in FIG. 15 and exhibited an [M+H]⁺¹ of 1161 corresponding to the conjugate of PTP1B.
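By way of illustration only, the per-aliquot amounts quoted above follow from the Step 1 quantities (50 nmol of Biotin derivative (IV) and 10 nmol of peptide in about 500 μL). The sketch also estimates an upper bound on the material injected for LC-MS. It assumes the small D-PEN and TCEP additions are negligible, that the 780 μL pool holds the peptide from one 150 μL aliquot, and that recovery is complete; none of these is stated in the text, and the function names are hypothetical.

```python
def aliquot_amount(total_nmol: float, reaction_uL: float, aliquot_uL: float) -> float:
    """Amount carried into an aliquot of a well-mixed reaction, in nmol."""
    return total_nmol * aliquot_uL / reaction_uL


def injected_pmol_upper_bound(amount_nmol: float, pool_uL: float,
                              fold_dilution: float, injection_uL: float) -> float:
    """Most material an injection could contain, assuming complete recovery."""
    conc_uM = 1000 * amount_nmol / pool_uL  # nmol/µL is mM; times 1000 gives µM
    return conc_uM / fold_dilution * injection_uL  # µM times µL gives pmol


biotin_nmol = aliquot_amount(50, 500, 150)
peptide_nmol = aliquot_amount(10, 500, 150)
print(biotin_nmol, peptide_nmol)  # about 15 nmol biotin and 3 nmol peptide
print(round(injected_pmol_upper_bound(peptide_nmol, 780, 4, 2), 2))
```

Under these assumptions, no more than about 2 pmol of tagged peptide would reach the instrument in the 2 μL injection.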

Example 5 Tagged Peptide

[0189] A particularly useful biotin release reagent is illustrated in FIG. 16A. It can be noted that this is Biotin derivative IV where n=2. The tag is made in a light form and also in a heavy form with ¹³C atoms for labeling the Cys residue in proteins. Fragmentation of a protein is not affected by the tag. Pairs of tagged peptides are provided by digestion and cleavage. See, e.g., the peptide ESGSLSPEHGPVVVHCSAGIGR as illustrated in FIG. 16B.

[0190] This biotin release reagent can be treated for 30 minutes in 5 mM TCEP as described above, obtaining 95% cleavage yield.
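By way of illustration only, a light and heavy pair of tagged peptides appears in a mass spectrum as two peaks separated by a fixed spacing, and the ratio of their intensities gives the relative abundance. The number of ¹³C atoms in the heavy tag is not stated in the text, so `n_13c` below is a free parameter; the peak areas and function names are hypothetical.

```python
C13_MINUS_C12 = 1.003355  # mass difference between 13C and 12C in Da


def pair_spacing_mz(n_13c: int, z: int) -> float:
    """m/z spacing between light and heavy forms of a peptide at charge z."""
    return n_13c * C13_MINUS_C12 / z


def heavy_to_light_ratio(area_heavy: float, area_light: float) -> float:
    """Relative abundance of the heavy-tagged versus light-tagged peptide."""
    if area_light <= 0:
        raise ValueError("light peak area must be positive")
    return area_heavy / area_light


# Hypothetical values: a tag carrying six 13C atoms, observed at charge 2.
print(round(pair_spacing_mz(6, 2), 3))
print(heavy_to_light_ratio(3.0e6, 1.5e6))
```

Because the two forms are chemically identical, they are expected to co-elute, so the ratio can be read from peaks in the same scan.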

REFERENCES

[0191] Ashikaga, K. et al. (1988) Bull. Chem. Soc. Jpn. 61:2443-2450.

[0192] Bayer, E. and Wilchek, M. (eds.) “Avidin-Biotin Technology,” (1990) Methods Enzymol. 184:49-51.

[0193] Bleasby, A. J. et al. (1994), “OWL - a non-redundant composite protein sequence database,” Nucl. Acids Res. 22:3574-3577.

[0194] Boucherie, H. et al. (1996), “Two-dimensional gel protein database of Saccharomyces cerevisiae,” Electrophoresis 17:1683-1699.

[0195] Brockhausen, I.; Hull, E.; Hindsgaul, O.; Schachter, H.; Shah, R. N.; Michnick, S. W.; Carver, J. P. (1989) Control of glycoprotein synthesis. J. Biol. Chem. 264, 11211-11221.

[0197] Chapman, A.; Fujimoto, K.; Kornfeld, S. (1980) The primary glycosylation defect in class E Thy-1-negative mutant mouse lymphoma cells is an inability to synthesize dolichol-P-mannose. J. Biol. Chem. 255, 4441-4446.

[0198] Chen, Y.-T. and Burchell, A. (1995), The Metabolic and Molecular Bases of Inherited Disease, Scriver, C. R. et al. (eds.) McGraw-Hill, N.Y., pp. 935-966.

[0199] Clauser, K. R. et al. (1995), “Rapid mass spectrometric peptide sequencing and mass matching for characterization of human melanoma proteins isolated by two-dimensional PAGE,” Proc. Natl. Acad. Sci. USA 92:5072-5076.

[0200] Cole, R. B. (1997) Electrospray Ionization Mass Spectrometry: Fundamentals, Instrumentation and Practice, Wiley, N.Y.

[0201] De Leenheer, A. P. and Thienpont, L. M. (1992), “Application of isotope dilution-mass spectrometry in clinical chemistry, pharmacokinetics, and toxicology,” Mass Spectrom. Rev. 11:249-307.

[0202] DeRisi, J. L. et al. (1997), “Exploring the metabolic and genetic control of gene expression on a genomic scale,” Science 278:680-6

[0203] Dongré, A. R., Eng, J. K., and Yates, J. R., 3rd (1997), “Emerging tandem-mass-spectrometry techniques for the rapid identification of proteins,” Trends Biotechnol. 15:418-425.

[0204] Ducret, A., VanOostveen, I., Eng, J. K., Yates, J. R., and Aebersold, R. (1998), “High throughput protein characterization by automated reverse-phase chromatography/electrospray tandem mass spectrometry,” Prot. Sci. 7:706-719.

[0205] Eng, J., McCormack, A., and Yates, J. R., 3rd (1994), “An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database,” J. Am. Soc. Mass Spectrom. 5:976-989.

[0206] Figeys, D. et al. (1998), “Electrophoresis combined with mass spectrometry techniques: Powerful tools for the analysis of proteins and proteomes,” Electrophoresis 19:1811-1818.

[0207] Figeys, D., and Aebersold, R. (1998), “High sensitivity analysis of proteins and peptides by capillary electrophoresis tandem mass spectrometry: Recent developments in technology and applications,” Electrophoresis 19:885-892.

[0208] Figeys, D., Ducret, A., Yates, J. R., and Aebersold, R. (1996), “Protein identification by solid phase microextraction-capillary zone electrophoresis-microelectrospray-tandem mass spectrometry,” Nature Biotech. 14:1579-1583.

[0209] Figeys, D., Ning, Y., and Aebersold, R. (1997), “A microfabricated device for rapid protein identification by microelectrospray ion trap mass spectrometry,” Anal. Chem. 69:3153-3160.

[0210] Freeze, H. H. (1998) Disorders in protein glycosylation and potential therapy. J. Pediatrics 133, 593-600. Freeze, H. H. (1999) Human glycosylation disorders and sugar supplement therapy. Biochem. Biophys. Res. Commun. 255, 189-193.

[0211] Gamper, H. B., “Facile preparation of nuclease resistant 3′-modified oligodeoxy-nucleotides,” Nucl. Acids Res., 21:145-150 (January 1993)

[0212] Garrels, J. I., McLaughlin, C. S., Warner, J. R., Futcher, B., Latter, G. I., Kobayashi, R., Schwender, B., Volpe, T., Anderson, D. S., Mesquita, F.-R., and Payne, W. E. (1997), “Proteome studies of Saccharomyces cerevisiae: identification and characterization of abundant proteins,” Electrophoresis 18:1347-1360.

[0213] Gerber, S. A.; Scott, C. R.; Turecek, F.; Gelb, M. H. (1999) Analysis of rates of multiple enzymes in cell lysates by electrospray ionization mass spectrometry. J. Am. Chem. Soc. 121, 1102-1103.

[0214] Glaser, L. (1966) Phosphomannomutase from yeast. In Meth. Enzymol. Vol. VIII, Neufeld, E. F.; Ginsburg, V. Eds; Academic Press: N.Y. 1966, pp. 183-185.

[0215] Gygi, S. P. et al. (1999), “Correlation between protein and mRNA abundance in yeast,” Mol. Cell. Biol. 19:1720-1730.

[0216] Gygi, S. P. et al. (1999), “Protein analysis by mass spectrometry and sequence database searching: tools for cancer research in the post-genomic era,” Electrophoresis 20:310-319.

[0217] Haynes, P. A., Fripp, N., and Aebersold, R. (1998), “Identification of gel-separated proteins by liquid chromatography electrospray tandem mass spectrometry: Comparison of methods and their limitations,” Electrophoresis 19:939-945.

[0218] Hodges, P. E. et al. (1999), “The Yeast Proteome Database (YPD): a model for the organization and presentation of genome-wide functional data,” Nucl. Acids Res. 27:69-73.

[0219] Johnston, M. and Carlson, M. (1992), in The Molecular and Cellular Biology of the Yeast Saccharomyces, Jones, E. W. et al. (eds.), Cold Spring Harbor Press, New York City, pp. 193-281.

[0220] Kataky, R. et al. J Chem Soc Perk T 2 (2) 321-327 FEB 1990.

[0221] Kaur, K. J.; Hindsgaul, O. (1991) A simple synthesis of octyl 3,6-di-O-(α-D-mannopyranosyl)-β-D-mannopyranoside and its use as an acceptor for the assay of N-acetylglucosaminyltransferase I activity. Glycoconjugate J. 8, 90-94.

[0222] Kaur, K. J.; Alton, G.; Hindsgaul, O. (1991) Use of N-acetylglucosaminyltransferases I and II in the preparative synthesis of oligosaccharides. Carbohydr. Res. 210, 145-153.

[0223] Korner, C.; Knauer, R.; Holzbach, U.; Hanefeld, F.; Lehle, L.; von Figura, K. (1998) Carbohydrate-deficient glycoprotein syndrome type V: deficiency of dolichyl-P-Glc:Man9GlcNAc2-PP-dolichyl glucosyltransferase. Proc Natl Acad Sci U.S.A. 95, 13200-13205.

[0224] Link, A. J., Hays, L. G., Carmack, E. B., and Yates, J. R., 3rd (1997), “Identifying the major proteome components of Haemophilus influenzae type-strain NCTC 8143,” Electrophoresis 18:1314-1334.

[0225] Link, J. et al. (1999), “Direct analysis of large protein complexes using mass spectrometry,” Nat. Biotech. 17:676-682 (July 1999)

[0226] Mann, M., and Wilm, M. (1994), “Error-tolerant identification of peptides in sequence databases by peptide sequence tags,” Anal. Chem. 66:4390-4399.

[0227] McMurry, J. E.; Kocovsky, P. (1984) A method for the palladium-catalyzed allylic oxidation of olefins. Tetrahedron Lett. 25, 4187-4190.

[0228] Morris, A. A. M. and Turnbull, D. M. (1994) Curr. Opin. Neurol. 7:535-541.

[0229] Neufeld, E. and Muenzer, J. (1995), “The mucopolysaccharidoses” In The Metabolic and Molecular Bases of Inherited Disease, Scriver, C. R. et al. (eds.) McGraw-Hill, New York, pp. 2465-2494.

[0230] Oda, Y. et al. (1999), “Accurate quantitation of protein expression and site-specific phosphorylation,” Proc. Natl. Acad. Sci. USA 96:6591-6596.

[0231] Okada, S. and O'Brien, J. S. (1968) Science 160:10002.

[0232] Opiteck, G. J. et al. (1997), “Comprehensive on-line LC/LC/MS of proteins,” Anal. Chem. 69:1518-1524.

[0233] Paulsen, H.; Meinjohanns, E. (1992) Synthesis of modified oligosaccharides of N-glycoproteins intended for substrate specificity studies of N-acetylglucosaminyltransferases II-V. Tetrahedron Lett. 33, 7327-7330.

[0234] Paulsen, H.; Meinjohanns, E.; Reck, F.; Brockhausen, I. (1993) Synthese von modifizierten Oligosacchariden der N-Glycoproteine zur Untersuchung der Spezifität der N-Acetylglucosaminyltransferase II. Liebigs Ann. Chem. 721-735.

[0235] Pennington, S. R., Wilkins, M. R., Hochstrasser, D. F., and Dunn, M. J. (1997), “Proteome analysis: From protein characterization to biological function,” Trends Cell Bio. 7:168-173.

[0236] Preiss, J. (1966) GDP-mannose pyrophosphorylase from Arthrobacter. In Meth. Enzymol. Vol. VIII, Neufeld, E. F.; Ginsburg, V. Eds; Academic Press: New York 1966, pp. 271-275.

[0237] Qin, J. et al. (1997), “A strategy for rapid, high-confidence protein identification,” Anal. Chem. 69:3995-4001.

[0238] Ronin, C.; Caseti, C.; Bouchilloux, C. (1981) Transfer of glucose in the biosynthesis of thyroid glycoproteins. I. Inhibition of glucose transfer to oligosaccharide lipids by GDP-mannose. Biochim. Biophys. Acta 674, 48-57.

[0239] Ronin, C.; Granier, C.; Caseti, C.; Bouchilloux, S.; Van Rietschoten, J. (1981 a) Synthetic substrates for thyroid oligosaccharide transferase. Effects of peptide chain length and modifications in the -Asn-Xaa-Thr- region. Eur. J. Biochem. 118, 159-164.

[0240] Ronne, H. (1995), “Glucose repression in fungi,” Trends Genet. 11:12-17.

[0241] Rush, J. S.; Wachter, C. J. (1995) Transmembrane movement of a water-soluble analogue of mannosylphosphoryldolichol is mediated by an endoplasmic reticulum protein. J. Cell. Biol. 130, 529-536.

[0242] Schachter, H. (1986) Biosynthetic controls that determine the branching and microheterogeneity of protein-bound oligosaccharides. Biochem. Cell Biol. 64, 163-181.

[0243] Scriver, C. R. et al. (1995), The Metabolic and Molecular Bases of Inherited Disease, Scriver, C. R. et al. (eds.) McGraw-Hill, N.Y., pp. 1015-1076.

[0245] Sechi, S. and Chait, B. T. (1998), “Modification of cysteine residues by alkylation. A tool in peptide mapping and protein identification,” Anal. Chem. 70:5150-5158.

[0246] Segal, S. and Berry, G. T. (1995), The Metabolic and Molecular Bases of Inherited Disease, Scriver, C. R. et al. (eds.), McGraw-Hill, N.Y., pp. 967-1000.

[0247] Romanowska, A. et al. (1994), “Michael Additions for Synthesis of Neoglycoproteins,” Methods Enzymol. Neoconjugates Part A (Synthesis) 242:90-101.

[0248] Roth, F. P. et al. (1998), “Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation,” Nat. Biotechnol. 16:939-945.

[0249] Shalon, D., Smith, S. J., and Brown, P. O. (1996), “A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization,” Genome Res. 6:639-645.

[0250] Shevchenko, A., Jensen, O. N., Podtelejnikov, A. V., Sagliocco, F., Wilm, M., Vorm, O., Mortensen, P., Shevchenko, A., Boucherie, H., and Mann, M. (1996), “Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels,” Proc. Natl. Acad. Sci. U.S.A. 93:14440-14445.

[0251] Shevchenko, A., Wilm, M., Vorm, O., and Mann, M. (1996), “Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels,” Anal. Chem. 68:850-858.

[0252] Tan, J.; Dunn, J.; Jaeken, J.; Schachter, H. (1996) Mutations in the MGAT2 gene controlling complex glycan synthesis cause carbohydrate deficient glycoprotein syndrome type II, an autosomal recessive disease with defective brain development. Am. J. Hum. Genet. 59, 810-817.

[0253] Velculescu, V. E., Zhang, L., Zhou, W., Vogelstein, J., Basrai, M. A., Bassett, D. E., Jr., Hieter, P., Vogelstein, B., and Kinzler, K. W. (1997), “Characterization of the yeast transcriptome,” Cell 88:243-251.

[0254] Wilbur, D. S. et al. (1997), “Biotin reagents for antibody pretargeting. Synthesis, radioiodination and in vitro evaluation of water soluble, biotinidase resistant biotin derivatives,” Bioconjugate Chem. 8:572-584.

[0255] Yates, J. R., 3rd, Eng, J. K., McCormack, A. L., and Schieltz, D. (1995), “Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database,” Anal. Chem. 67:1426-1436.

[0256] Variations, modifications, and other implementations of what is described herein will occur to those of ordinary skill in the art without departing from the spirit and scope of the invention as described and claimed herein.

[0257] All of the references identified hereinabove are expressly incorporated herein by reference.

SEQ ID NO: 1 (length 22; type PRT; organism Homo sapiens): Glu Ser Gly Ser Leu Ser Pro Glu His Gly Pro Val Val Val His Cys Ser Ala Gly Ile Gly Arg

What is claimed is:
 1. A reagent for mass spectrometric analysis ofproteins comprising a tag molecule, wherein the tag molecule comprises areactive site for stably associating with a protein, an isotope label,and a biotin compound linked to the tag molecule through a pH orreducing agent sensitive bond.
 2. The reagent of claim 1, wherein thebiotin compound comprises biotin and a2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyricacid coupled through a di(2-aminoethyl)ether.
 3. The reagent of claim 1,wherein the di(2-aminoethyl)ether comprises one or more ethylene glycolrepeat units interposed between the amino residues.
 4. The reagent ofclaim 1, wherein the biotin and the2-[2-(2-iodo-acetylamino)-1,1-dimethyl-ethyldisulfanyl]-3-methyl-butyricacid are coupled through a linker having the formula:—NH((CH₂)₂O)_(n)(CH₂)₂NH—, where n is an integer of from 0 to about 5.5. The reagent according to claim 1, wherein the anchoring site of thetag molecule forms covalent bonds to a cis hydroxyl pair under selectedpH conditions.
 6. The reagent according to claim 1, wherein theanchoring site of the tag molecule forms a covalent bond to a serineresidue.
 7. The reagent according to claim 1, wherein the isotope is selected from the group consisting of a stable isotope of hydrogen, a stable isotope of nitrogen, a stable isotope of oxygen, a stable isotope of carbon, a stable isotope of phosphorus and a stable isotope of sulfur.
 8. The reagent according to claim 1, wherein the reactive site of the tag molecule is stably associated with a protein.
 9. The reagent according to claim 1, wherein the reactive site of the tag molecule is stably associated with a peptide.
 10. The reagent according to claim 1, wherein the reactive site group is selected from the group consisting of a chemical moiety which reacts with sulfhydryl groups, a moiety that reacts with amino groups, a moiety that reacts with carboxylate groups, a moiety that reacts with ester groups, a phosphate reactive group, an aldehyde reactive group, a ketone reactive group and a moiety that reacts with homoserine lactone after fragmentation with CNBr.
 11. The reagent according to claim 1, wherein the pH sensitive anchoring group forms a bond with a solid phase under selected pH conditions and wherein the bond is selected from the group consisting of an acyloxyalkyl ether bond, acetal bond, thioacetal bond, aminal bond, imine bond, carbonate bond, ketal bond and disulfide bond.
 12. The reagent according to claim 1, wherein the tag molecule is attached to a solid phase.
 13. The reagent according to claim 12, wherein the tag molecule is attached to a solid phase through an avidin/biotin complex.
 14. The reagent according to claim 1, wherein the tag molecule is attached to a solid phase through an avidin/biotin complex.
 15. The reagent according to claim 1, wherein the tag molecule is about 175-300 daltons.
 16. The reagent according to claim 3, wherein the isotope is covalently bound to the tag molecule.
 17. The reagent according to claim 1, wherein the reactive site forms stable associations with a modified residue of a protein.
 18. The reagent according to claim 17, wherein the modified residue is glycosylated, methylated, acylated, phosphorylated, ubiquitinated, farnesylated, or ribosylated.
 19. A composition comprising a pair of tag molecules according to claim 1, wherein each member of the pair is identical except for the mass of the isotope attached thereto.
 20. The composition according to claim 19, wherein one member of the pair comprises a heavy isotope and the other member of the pair comprises the corresponding light form of the isotope.
 21. A composition, comprising a reagent for mass spectrometric analysis of proteins comprising a first and second tag molecule, wherein the first tag molecule comprises a reactive site for stably associating with a protein, an isotope label, and a biotin compound linked to the tag molecule through a pH sensitive bond, the biotin compound providing an anchoring site for anchoring the tag molecule to a solid phase and the second tag molecule is identical to the first tag molecule but does not comprise an isotope label.
 22. A kit comprising at least one reagent according to claim 1, and one or more of a reagent selected from the group consisting of: an activating agent for providing active groups on a protein which bind to the reactive site of the tag molecule; a solid phase; one or more agents for lysing a cell; a pH altering agent; one or more proteases; one or more cell samples or fractions thereof.
 23. A kit according to claim 22, wherein the tag molecule further comprises a peptide.
 24. A kit comprising at least one reagent according to claim 21, and one or more of a reagent selected from the group consisting of: an activating agent for providing active groups on a protein which bind to the reactive site of the tag molecule; a solid phase; one or more agents for lysing a cell; a pH altering agent; one or more proteases; one or more cell samples or fractions thereof.
 25. A kit according to claim 24, wherein the tag molecule further comprises a peptide.
 26. A kit comprising a plurality of tagged peptide molecules, each tagged peptide molecule comprising a peptide and a tag molecule stably associated with the protein, the tag molecule further comprising an isotope label, and a biotin compound linked to the tag molecule through a pH sensitive bond, the biotin compound providing anchoring of the tag molecule to a solid phase.
 27. The kit according to claim 26, wherein the kit comprises pairs of tagged peptides and wherein each member of a pair of tagged peptides comprises an identical peptide and each member of the pair is differentially labeled.
 28. The kit according to claim 26, comprising at least one set of tagged peptides comprising different peptides corresponding to a single protein.
 29. The kit according to claim 26, comprising at least one set of tagged peptides comprising peptides corresponding to modified and unmodified forms of a single protein.
 30. The kit according to claim 26, comprising at least one set of tagged peptides from a first cell at a first cell state and at least one set of tagged peptides from a second cell at a second cell state.
 31. The kit according to claim 30, wherein the first cell is a normally proliferating cell and the second cell is an abnormally proliferating cell.
 32. The kit according to claim 30, wherein the first and second cells represent different stages of cancer.
 33. A method for identifying one or more proteins or protein functions in one or more samples containing mixtures of proteins comprising: reacting a sample with a first reagent according to claim 1 and a solid phase under conditions suitable to form a solid phase-isotope labeled tag molecule-protein complex; digesting the complex with one or more proteases, thereby generating solid phase-isotope labeled tag molecule-peptide complexes and untagged peptides; purifying the solid phase-isotope labeled tag molecule-peptide complexes; exposing the solid phase-isotope labeled tag molecule-peptide complexes to a pH or a reducing agent which disrupts associations between the anchoring site of the tag molecule and the solid phase, thereby releasing a tagged peptide from the solid phase; determining the mass of the tagged peptide; and correlating the mass to the identity and/or activity of a protein.
 34. The method according to claim 33, wherein the mass-to-charge ratio of the tagged peptide is determined.
 35. The method according to claim 33, further comprising subjecting a sample comprising one or more tagged peptides to a separation step.
 36. The method according to claim 35, wherein the separation step comprises liquid chromatography.
 37. The method according to claim 36, comprising subjecting one or more tagged peptides to MS^(n) analysis.
 38. The method according to claim 28, further comprising reacting a second sample with a second reagent comprising an identical molecular tag as the first reagent but which is differentially labeled.
 39. The method according to claim 38, further comprising combining the two samples prior to protease digestion and generating a combined sample comprising at least one pair of tagged peptides, each member of the pair comprising identical peptides but differing in mass.
 40. The method according to claim 39, comprising determining the ratio of members of at least one tagged peptide pair in the combined sample.
 41. The method according to claim 40, further comprising generating mass spectra comprising at least one signal doublet for each peptide in the sample, the signal doublet comprising a first signal and a second signal shifted a number of known units from the first signal, wherein the known units represent the difference in molecular weight between the two members of a tagged peptide pair.
 42. The method according to claim 41, further comprising determining a signal ratio for a given peptide by relating the difference in signal intensity between the first signal and the second signal.
 43. The method according to claim 33, further comprising the step of relating mass spectra data from a tagged peptide to an amino acid sequence.
 44. The method according to claim 38, further comprising the step of relating mass spectra data from a tagged peptide to an amino acid sequence.
 45. The method according to claim 33, wherein the steps of the method are repeated, either sequentially or simultaneously, until substantially all of the proteins in a sample are detected and/or identified.
 46. The method according to claim 38, wherein the relative amounts of members of a tagged peptide pair in the two samples are determined and correlated with the abundance of the protein corresponding to the peptide in the sample.
 47. The method according to claim 46, further comprising correlating the relative abundance of the protein with the state of the cells.
 48. The method according to claim 47, wherein correlating is used to diagnose a pathological condition in a patient from whom one of the cell samples was obtained.
 49. The method according to claim 33, comprising determining the quantity of a protein corresponding to the peptide in the sample.
 50. The method according to claim 33, comprising determining the site of a modification of a protein in one or more samples, by reacting sample proteins with a tag molecule comprising a reactive site which reacts with a modified residue on the protein.
 51. The method according to claim 38, comprising determining the site of a modification of a protein in one or more samples, by reacting sample proteins with a tag molecule comprising a reactive site which reacts with a modified residue on the protein.
 52. The method according to claim 48, further comprising determining the amount of modified protein in the sample.
 53. The method according to claim 33, wherein the exposing step utilizes a reducing agent comprising a phosphine.
 54. The method according to claim 53, wherein the phosphine comprises TCEP.
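
By way of illustration only, the signal doublet and signal ratio recited in claims 41 and 42 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method of the invention: the names Peak, find_doublets and signal_ratio, the 8 Da mass shift, the matching tolerance, and the peak values are all hypothetical, and the ratio is expressed here as heavy intensity divided by light intensity.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    """One signal in a mass spectrum (hypothetical container)."""
    mz: float         # mass-to-charge ratio
    intensity: float  # signal intensity, arbitrary units


def find_doublets(peaks: List[Peak], mass_shift: float,
                  charge: int = 1, tol: float = 0.01) -> List[Tuple[Peak, Peak]]:
    """Pair each lighter signal with a signal shifted by mass_shift / charge.

    mass_shift is the known mass difference (in Da) between the two members
    of a tagged peptide pair; tol is an assumed m/z matching tolerance.
    """
    expected = mass_shift / charge
    ordered = sorted(peaks, key=lambda p: p.mz)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            gap = heavy.mz - light.mz
            if gap > expected + tol:
                break  # peaks are sorted, so later gaps only grow
            if abs(gap - expected) <= tol:
                pairs.append((light, heavy))
    return pairs


def signal_ratio(light: Peak, heavy: Peak) -> float:
    """Relate the two signals of a doublet as heavy / light intensity."""
    if light.intensity == 0:
        raise ValueError("light signal has zero intensity")
    return heavy.intensity / light.intensity


if __name__ == "__main__":
    # Hypothetical spectrum: one doublet separated by 8 Da plus an unpaired peak.
    spectrum = [Peak(1000.50, 2000.0), Peak(1008.50, 1000.0), Peak(1200.00, 500.0)]
    for light, heavy in find_doublets(spectrum, mass_shift=8.0):
        print(f"{light.mz:.2f} / {heavy.mz:.2f}: heavy/light = {signal_ratio(light, heavy):.2f}")
```

For a multiply charged peptide the spacing between the two signals on the m/z axis is the mass shift divided by the charge, which is why the sketch takes the charge as a parameter rather than assuming singly charged ions.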